Application and Research of Music Generation System Based on CVAE and Transformer-XL in Video Background Music

Jun Min, Zhiwei Gao*, Lei Wang

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    3 Citations (Scopus)
    20 Downloads (Pure)

    Abstract

    In algorithmic music generation, processing time-series data has long been a difficult task. To improve the generation of long musical sequences, an insightful-unit-conditional variational autoencoder is proposed, which enhances the unit-conditional variational autoencoder with an improved attention mechanism. The model integrates Transformer-XL's recurrence mechanism and relative positional encoding at measure-level granularity. For practical application, a scheme is presented that uses optical flow to extract motion features from video frames, quantifying motion rate and intensity. Furthermore, a dynamic correlation method is proposed to align video motion features with musical rhythm, guiding the model to generate melodies that match the video's rhythm.
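
    The abstract does not define how motion rate, motion intensity, or the correlation with rhythm are computed. As a rough illustration only, the sketch below assumes that intensity is the mean magnitude of a per-frame optical-flow field, that rate is the number of local intensity peaks per second, and that alignment is scored with a plain Pearson correlation between per-measure motion intensity and per-measure note density. All function names and these definitions are hypothetical and are not taken from the paper; the flow vectors are assumed to come from an external optical-flow estimator.

```python
from math import sqrt
from statistics import mean


def motion_intensity(flow):
    """Mean magnitude of one flow field, given as a list of (dx, dy) vectors.

    Assumption: "intensity" is read here as average flow magnitude per frame.
    """
    return mean(sqrt(dx * dx + dy * dy) for dx, dy in flow)


def motion_rate(intensities, fps):
    """Local intensity peaks per second.

    Assumption: "rate" is read here as how often motion peaks occur.
    """
    peaks = sum(
        1
        for i in range(1, len(intensities) - 1)
        if intensities[i - 1] < intensities[i] > intensities[i + 1]
    )
    return peaks * fps / len(intensities)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical per-measure values: motion intensity vs. notes per measure.
motion_per_measure = [0.2, 0.8, 0.5, 1.1]
notes_per_measure = [4, 10, 7, 12]
alignment_score = pearson(motion_per_measure, notes_per_measure)
```

    Under these assumptions, an alignment score close to 1 would indicate that busier measures coincide with stronger on-screen motion. A score of this kind could act as a guidance signal during generation, but the method actually used in the paper may differ.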
    Original language: English
    Pages (from-to): 1409-1418
    Number of pages: 10
    Journal: IEEE Transactions on Industrial Informatics
    Volume: 21
    Issue number: 2
    Early online date: 22 Oct 2024
    Publication status: Published - 1 Feb 2025

    Keywords

    • Transformers
    • Computational modeling
    • Mathematical models
    • Vectors
    • Standards
    • Encoding
    • Rhythm
    • Training
    • Optical variables measurement
    • Decoding
    • Artificial intelligence (AI)
    • background music
    • conditional variational autoencoders (CVAE)
    • music generation
    • TransformerXL feature extraction