Development of a Content Creation Model Using Natural Language Generation

Authors

  • Paul-Odeli Jonathan Ateko, Ignatius Ajuru University of Education
  • Constance I. Amannah, Ignatius Ajuru University of Education
  • Henry Onyebuchukwu Ordu, Ignatius Ajuru University of Education
  • Emmanuel J. Izionworu, Ignatius Ajuru University of Education

DOI:

https://doi.org/10.55123/jomlai.v5i1.7726

Keywords:

Natural Language Generation, Automated Content Creation, Transformer Models, Content Optimization, Ethical AI

Abstract

The increasing demand for scalable, high-quality digital content has exposed the limitations of manual content creation and of existing Natural Language Generation (NLG) systems, particularly in domain specificity, ethical reliability, and readiness for optimization. This study addresses that gap by developing NLG-ACCO, a transformer-based model for automated content creation and optimization in educational, media, and digital marketing applications. Transformer-XL was selected over newer architectures such as Llama-3 or Mistral because it models contextual dependencies beyond fixed-length segments, which is essential for coherent paragraph-level content, while offering a better trade-off between performance, computational efficiency, and transparency under resource-constrained conditions. The model integrates domain-aware fine-tuning, reinforcement learning, SEO optimization, and ethical safeguards, including bias detection and factual verification. Evaluation used BLEU, ROUGE, readability indices, and perplexity. NLG-ACCO achieved a BLEU score of 0.79 (baseline: 0.61) and a ROUGE-L score of 0.76 (baseline: 0.36). Perplexity dropped from 45.2 to 27.8, indicating more coherent predictions. Readability improved by 24%, post-editing time decreased by 38.5%, and bias detection mitigated 87% of flagged cases. These results demonstrate that integrating optimization and ethical controls within Transformer-XL frameworks significantly enhances content quality and reliability.
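
To put the reported figures on a common scale, the following minimal Python sketch computes the relative change of each metric against its baseline and illustrates the standard definition of perplexity (the exponential of the mean negative log-probability per token). The helper names `perplexity` and `relative_change` are illustrative and do not come from the paper, the relative changes are derived here from the abstract's numbers rather than reported by the authors, and the uniform-probability input is a toy example, not the study's data.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


def relative_change(baseline, value):
    """Signed relative change of `value` with respect to `baseline`."""
    return (value - baseline) / baseline


# Figures quoted in the abstract, as (baseline, NLG-ACCO) pairs.
reported = {
    "BLEU": (0.61, 0.79),
    "ROUGE-L": (0.36, 0.76),
    "Perplexity": (45.2, 27.8),
}

# Derived values (an assumption of this sketch, not stated in the source).
changes = {name: relative_change(b, v) for name, (b, v) in reported.items()}

for name, delta in changes.items():
    print(f"{name}: {delta:+.1%}")

# Toy check of the definition: a model that assigns probability 1/4
# to every token has a perplexity of 4.
uniform_ppl = perplexity([0.25] * 8)
print(f"Uniform 1/4 model perplexity: {uniform_ppl:.2f}")
```

Read this way, the reported scores correspond to roughly a 29.5% relative gain in BLEU, a 111.1% relative gain in ROUGE-L, and a 38.5% relative reduction in perplexity. Lower perplexity means the model is, on average, less uncertain about each next token.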

Published

2026-03-15

How to Cite

Paul-Odeli Jonathan Ateko, Constance I. Amannah, Henry Onyebuchukwu Ordu, & Emmanuel J. Izionworu. (2026). Development of a Content Creation Model Using Natural Language Generation. JOMLAI: Journal of Machine Learning and Artificial Intelligence, 5(1), 17–27. https://doi.org/10.55123/jomlai.v5i1.7726

Section

Articles