Zorse is a new open source project that will build datasets to train and evaluate large language models on mainframe programming languages such as COBOL.
Large Language Models (LLMs) are a type of artificial intelligence (AI) trained on large amounts of text data, using machine learning techniques to understand and generate human language. LLMs are built on deep learning architectures that use neural networks to extract meaning from text and capture the relationships between words and phrases.
When it comes to mainframes, however, LLMs perform comparatively poorly at reading and writing code because there is very little mainframe code for them to train on. They are considerably worse at programming tasks in mainframe languages than in modern ones, which makes AI coding assistants far less useful for mainframe developers.
Zorse aims to solve these challenges by collecting a large dataset to improve LLMs' ability to understand and write mainframe code. The project will also create an evaluation tool to measure the performance of LLMs on mainframe programming tasks. Ultimately, this will help build AI coding tools that boost the productivity of mainframe software engineers.
Starting with a collection of permissively licensed code, Zorse will also gather the source code of decommissioned mainframe systems, building a dataset on which to train LLMs.
The Zorse Project also aims to measure the ability of LLMs to read and write mainframe languages. The project will release an evaluation benchmark for COBOL called COBOLEval.
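To give a flavor of what such a benchmark measures, consider the kind of task it might pose: given a program specification and scaffold, the model must produce working COBOL. The snippet below is a hypothetical illustration, not the actual COBOLEval format; the program name, data layout, and expected completion are assumptions made for the example.

      * Illustrative completion task (hypothetical, for explanation only):
      * the model is asked to fill in the PROCEDURE DIVISION so that the
      * program displays the sum of the two working-storage numbers.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ADD-TWO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 NUM-A      PIC 9(4)  VALUE 0012.
       01 NUM-B      PIC 9(4)  VALUE 0030.
       01 NUM-SUM    PIC 9(5)  VALUE ZEROS.
       PROCEDURE DIVISION.
      *    --- expected model completion begins here ---
           ADD NUM-A TO NUM-B GIVING NUM-SUM.
           DISPLAY "SUM = " NUM-SUM.
           STOP RUN.

A benchmark built from tasks like this can score a model by compiling its completion and checking the program's output, giving a concrete measure of how well it handles COBOL.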
Zorse will work closely with the Open Mainframe Project's COBOL Programming Course and COBOL Check, a collaboration that offers a great opportunity to blend education and practical testing for modernizing and maintaining legacy systems.