References

Abdurahman, S., Atari, M., Karimi-Malekabadi, F., Xue, M. J., Trager, J. P., Park, P. S., … Dehghani, M. (2024). Perils and opportunities in using large language models in psychological research. PNAS Nexus, 3 (7), pgae245. https://doi.org/10.1093/pnasnexus/pgae245

Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31 (3), 337–351. https://doi.org/10.1017/pan.2023.2

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610–623). https://doi.org/10.1145/3442188.3445922

Bisbee, J., Clinton, J. D., Dorff, C., Kenkel, B., & Larson, J. M. (2024). Synthetic replacements for human survey data? The perils of large language models. Political Analysis, 32 (3), 401–416. https://doi.org/10.1017/pan.2024.5

DeVellis, R. F. (2017). Scale development: Theory and applications (4th ed.). Thousand Oaks, CA: SAGE.

Domínguez-Olmedo, R., Hardt, M., & Mendler-Dünner, C. (2024). Questioning the survey responses of large language models. In Advances in Neural Information Processing Systems, 37.

European Parliament. (2016). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). Official Journal of the European Union, L119, 1–88.

Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey methodology (2nd ed.). Hoboken, NJ: Wiley.

Hanke, V., Blanchard, T., Boenisch, F., Olatunji, I. E., Backes, M., & Dziedzic, A. (2024). Open LLMs are necessary for current private adaptations and outperform their closed alternatives. In Advances in Neural Information Processing Systems, 37. https://doi.org/10.48550/arXiv.2411.05818

Jansen, B. J., Jung, S., & Salminen, J. (2023). Employing large language models in survey research. Natural Language Processing Journal, 4, 100020. https://doi.org/10.1016/j.nlp.2023.100020

Kang, A., Appasani, N., Zaki, M., & Neumann, K. (2024). Synthetic data generation with LLM for improved depression prediction. arXiv preprint arXiv:2411.17672. https://doi.org/10.48550/arXiv.2411.17672

Kang, A., Appasani, N., Zaki, M., & Neumann, K. (2025). Synthetic data generation with LLM for improved depression prediction. Nature Digital Medicine, 3 (11), 156–168.

Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537–567.

Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22 (140), 1–55.

OpenAI. (2023). GPT-4 technical report. arXiv preprint arXiv:2303.08774. https://doi.org/10.48550/arXiv.2303.08774

Shrestha, P., Koaik, F., Schnider, R., & Sayess, D. (2025). Beyond WEIRD: Can synthetic survey participants substitute for humans in global policy research? Behavioral Science & Policy, 3 (X), 1–20. https://doi.org/10.1177/23794607241311793

Suh, J., Kim, H., & Park, S. (2024). Language model fine-tuning on scaled survey data for predicting distributions of public opinions. In Proceedings of EMNLP 2024.

Wuttke, A., Aßenmacher, M., Klamm, C., Lang, M. M., Würschinger, Q., & Kreuter, F. (2025). AI conversational interviewing: Transforming surveys with LLMs as adaptive interviewers. In Proceedings of the LaTeCH-CLfL 2025 Conference. https://doi.org/10.48550/arXiv.2410.01824

Zou, Z., Mubin, O., Alnajjar, F., & Ali, L. (2024). A pilot study of measuring emotional response and perception of LLM-generated and human-generated questionnaires. Scientific Reports, 14, 2781. https://doi.org/10.1038/s41598-024-53255-1