Application and Research of Music Generation System Based on CVAE and Transformer-XL in Video Background Music

Jun Min, Zhiwei Gao*, Lei Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In algorithmic music generation, processing time-series data has consistently been a complex task. To improve the generation of long music sequences, an insightful-unit-conditional variational autoencoder is proposed, which enhances unit-conditional variational autoencoders with an improved attention mechanism. The model integrates Transformer-XL's recurrence mechanism and relative positional encoding at measure-level granularity. For practical applications, a scheme is presented that uses optical flow to extract motion features from video frames, quantifying motion rate and intensity. Furthermore, a dynamic correlation method is proposed to align video motion features with musical rhythm, guiding the model to generate melodies that match the video's rhythm.
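
The motion-feature step can be illustrated with a minimal sketch. It assumes a dense optical-flow field has already been computed for a pair of frames and is given as rows of (dx, dy) displacements. The names `motion_features` and `intensity_to_note_density`, the threshold, and the linear rescaling are hypothetical illustrations: the abstract does not define "motion rate" or "intensity", and the rescaling is a simple stand-in, not the article's dynamic correlation method.

```python
import math


def motion_features(flow, threshold=1.0):
    """Summarise one dense flow field (assumed input format).

    flow: list of rows, each row a list of (dx, dy) displacement pairs.
    Returns (intensity, rate), where
      intensity = mean displacement magnitude over all pixels, and
      rate      = fraction of pixels moving more than `threshold`.
    Both definitions are assumptions made for illustration.
    """
    magnitudes = [math.hypot(dx, dy) for row in flow for dx, dy in row]
    if not magnitudes:
        return 0.0, 0.0
    intensity = sum(magnitudes) / len(magnitudes)
    rate = sum(1 for m in magnitudes if m > threshold) / len(magnitudes)
    return intensity, rate


def intensity_to_note_density(intensities, min_notes=1, max_notes=8):
    """Hypothetical stand-in for rhythm alignment.

    Linearly rescales per-segment motion intensity to a target number
    of notes per measure, so busier video segments get denser rhythm.
    """
    lo, hi = min(intensities), max(intensities)
    span = hi - lo
    if span == 0:
        # No variation in motion: fall back to the sparsest rhythm.
        return [min_notes] * len(intensities)
    return [
        round(min_notes + (x - lo) / span * (max_notes - min_notes))
        for x in intensities
    ]
```

In such a sketch, the per-measure note densities would act as a conditioning signal for the generator; how the article actually couples motion features to the model is not stated in the abstract.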
Original language: English
Pages (from-to): 1409-1418
Number of pages: 10
Journal: IEEE Transactions on Industrial Informatics
Volume: 21
Issue number: 2
Early online date: 22 Oct 2024
Publication status: Published - 1 Feb 2025

Keywords

  • Transformers
  • Computational modeling
  • Mathematical models
  • Vectors
  • Standards
  • Encoding
  • Rhythm
  • Training
  • Optical variables measurement
  • Decoding
  • Artificial intelligence (AI)
  • background music
  • conditional variational autoencoders (CVAE)
  • music generation
  • TransformerXL feature extraction
