TITLE:
Coherent Music Composition with Efficient Deep Learning Architectures and Processes
AUTHORS:
Corey Zhang
KEYWORDS:
Transformer, Attention, Long-Term Structure, Architecture
JOURNAL NAME:
Art and Design Review, Vol.11 No.3, August 8, 2023
ABSTRACT: In recent years, significant advancements in
music-generating deep learning models and neural networks have revolutionized
the process of composing harmonious music. One notable innovation is
the Music Transformer, a neural network that builds context and tracks
relationships across sequential input. By leveraging transformer-based
frameworks designed for sequential tasks and long-range dependencies, the
Music Transformer captures self-reference through attention and learns during
training to continue musical themes. This attention-based model offers the advantage
of being easily trainable and capable of generating musical performances with long-term
structure, as demonstrated by Google Brain’s implementation. In this study, I
will explore various instances and applications of the Music Transformer,
highlighting its ability to efficiently generate symbolic musical structures.
Additionally, I will delve into another state-of-the-art model called TonicNet,
featuring a layered architecture combining GRU and self-attention mechanisms.
TonicNet exhibits particular strength in generating music with enhanced
long-term structure, as evidenced by its superior performance in both objective
metrics and subjective evaluations. To further improve TonicNet, I will
evaluate its performance using the same metrics and propose modifications to
its hyperparameters, architecture, and dataset.
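
As a rough illustration of how attention can capture self-reference in a sequence, the following minimal sketch implements causal scaled dot-product self-attention in plain Python. It is not the Music Transformer or TonicNet implementation: the function name, the toy two-dimensional embeddings, and the use of the raw inputs as queries, keys, and values are all assumptions made for illustration. It omits learned projections, multiple heads, and the relative position information that the Music Transformer adds.

```python
import math

def causal_self_attention(queries, keys, values):
    """Scaled dot-product self-attention with a causal mask.

    Each argument is a list of equal-length float vectors, one per time step.
    Step t may only attend to steps 0..t, as in autoregressive generation.
    """
    dim = len(queries[0])
    outputs, all_weights = [], []
    for t, q in enumerate(queries):
        # Similarity of the current step to itself and every earlier step.
        scores = [sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(dim)
                  for s in range(t + 1)]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the visible value vectors.
        out = [sum(w * values[s][d] for s, w in enumerate(weights))
               for d in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy embeddings: steps 0 and 3 stand for the same motif.
steps = [[4.0, 0.0], [0.0, 4.0], [0.0, 4.0], [4.0, 0.0]]
outputs, weights = causal_self_attention(steps, steps, steps)
```

In this toy sequence the last step places almost all of its attention on itself and on step 0, the earlier occurrence of the same motif, which is the kind of long-range self-reference the abstract describes.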