Paraphrase Generation For Reading Comprehension


  • Faishal Januarahman School of Computing, Telkom University, Indonesia
  • Ade Romadhony School of Computing, Telkom University, Indonesia




Keywords: BLEU, Human Evaluation, Paraphrase Generation, ROUGE, Reading Comprehension, Thesaurus


Reading comprehension is an assessment that tests a reader's understanding of a concept from a given text. The test is conducted by posing questions about the content of the text. The purpose of this research is to create new variations of existing questions; one way to achieve this is to paraphrase the questions through the task of paraphrase generation. Varied questions can help confirm that readers have fully grasped a concept in the text. This study employs a traditional thesaurus-based approach, in which words are substituted with synonyms drawn from the Indonesian Thesaurus dictionary. The data consist of Indonesian-language reading comprehension assessment questions ranging from elementary to high school level. To measure the quality of the generated paraphrased questions, two evaluations are conducted: automatic evaluation, with scores ranging from 0 to 1, and human evaluation, with scores ranging from 1 to 4. The automatic evaluation yields a BLEU-4 score of 0.044 and a ROUGE-L F1-score of 0.421. In the human evaluation, the relevancy score is 2.533 and the fluency score is 3.186. The results from both evaluations indicate that the generated paraphrased questions exhibit diverse new word choices but tend to have slightly different meanings from the reference questions.
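As an illustration of the thesaurus-based substitution and the ROUGE-L F1 measure mentioned in the abstract, the following is a minimal Python sketch and not the authors' implementation. The thesaurus entries, the example question, and the function names (`paraphrase`, `lcs_length`, `rouge_l_f1`) are assumptions made for this example. The sketch uses whitespace tokenization, leaves out any part-of-speech or context filtering, and computes a plain F1 (equal weight on precision and recall) over the longest common subsequence.

```python
import random

# Toy thesaurus: hypothetical entries for illustration only, not taken from
# the Indonesian Thesaurus dictionary used in the study.
TOY_THESAURUS = {
    "utama": ["pokok"],
    "bacaan": ["teks"],
}


def paraphrase(question, thesaurus, rng=None):
    """Replace each word found in the thesaurus with one of its synonyms."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    out = []
    for token in question.split():
        core = token.rstrip("?.,!")   # word without trailing punctuation
        tail = token[len(core):]      # the punctuation that was stripped
        synonyms = thesaurus.get(core.lower())
        if synonyms:
            out.append(rng.choice(synonyms) + tail)
        else:
            out.append(token)         # no entry: keep the word unchanged
    return " ".join(out)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Plain F1 over LCS-based precision and recall (whitespace tokens)."""
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


original = "Apa gagasan utama bacaan tersebut?"
generated = paraphrase(original, TOY_THESAURUS)
print(generated)
print(round(rouge_l_f1(generated, original), 3))
```

In this toy case two of the five words are replaced, so the longest common subsequence covers three tokens. A lower overlap score therefore reflects more substitutions, which is consistent with the abstract's reading that low n-gram overlap indicates diverse word choices rather than poor fluency.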







How to Cite

Januarahman, F., & Romadhony, A. (2023). Paraphrase Generation For Reading Comprehension. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 8(4), 2018-2026.