The Experimental and Computational Linguistics Ensemble Lab (ECOLE)

Language is an essential part of human experience. We use computational, experimental, and corpus linguistic methods to understand the unique role of language in our lives and in our society. The Experimental and Computational Linguistics Ensemble Lab (ECOLE) at San Francisco State University focuses on research on both cognitive and social aspects of language.

Research

Our research projects focus on both cognitive and social aspects of language. On the one hand, we study linguistic phenomena that shed light on the question of grammar architecture and the relation between language and cognition. On the other hand, we are interested in what language reveals about its users and about society in general.

Current Research:

Generative AI and Large Language Models

Our most recent line of research focuses on Generative AI and Large Language Models. We study how Large Language Models perform compared to humans on a range of tasks, from world knowledge and commonsense reasoning to text simplification.

The Syntax and Semantics of Queries

The internet gave rise to a new form of language use: queries. Yet little is known about how we formulate queries. Search engines process queries as ‘word salad’, identifying keywords inside the query and paying little attention to word order. In this project we investigate the hypothesis that queries are more than ‘word salad’ and that they have syntactic and semantic structure.

World Knowledge and Lexical Meaning

What do we know when we know a word? To answer this question, we investigate properties of nominalizations, i.e. nouns morphologically related to verbs, such as destruction. Results from experimental and corpus studies suggest that information about event participants is lexically encoded, thus providing theoretical support for the lexicalist approach.

Subjectivity in Language

Do you say Democrats and Republicans or Republicans and Democrats? In a series of papers we show that word order in binomials (above) and in prenominal adjectives depends on subjective preferences of the speaker: the attribute that is psychologically closer to the speaker is mentioned first.

 

Recent Publications

  • Smirnova, A. (2021). Variation in linguistic complexity and its cognitive underpinning. Proceedings of the 43rd Annual Conference of the Cognitive Science Society, 2162-2168. https://escholarship.org/uc/item/8sf5t58c

  • Smirnova, A. (2021). Evidentiality in abductive reasoning: Experimental support for a modal analysis of evidentials. Journal of Semantics, 38(4), 531-570. https://doi.org/10.1093/jos/ffab013

  • Kakar, V., Kulkarni, A., Holschuh, C., Smirnova, A., & Modrek, S. (2022). Contraception information on the websites of student health centers in the United States. Contraception. https://doi.org/10.1016/j.contraception.2022.01.007

Conferences and Presentations

  • Ava Austin and Anastasia Smirnova will present on Faculty-student collaboration in the ECOLE at the National Conference on Undergraduate Research (NCUR). April 8-10, 2024. Long Beach.

  • May Reese and Anastasia Smirnova will present on Comparing ChatGPT and Humans on World Knowledge and Commonsense Reasoning Tasks at the Conference on Computer-Human Interaction (CHI). May 11-16, 2024.

  • Malleeswari Jagabattuni (MJ) presented a collaborative research project with Paige Peterson, Baowa Li, and Mele Thomas on Capturing Changes in Public Opinion on Bilingualism: Computational Analysis of Sentiment in the Media. SFSU CSU Research Competition. February 29, 2024.

  • Anastasia Smirnova organized and led a panel on ChatGPT and its Impact on Higher Education. SFSU LCA event. February 22, 2023.

  • Helena Almassy, Ava Austin, Jenna Ferrario, Mikey Pagan, May Reese, and Erli Tang presented Evaluating ChatGPT Against NLU Benchmarks at UC Davis Symposium on Language Research. May 26, 2023.

  • Ava Austin presented Evaluating ChatGPT Against NLU Benchmarks at the Southern California Undergraduate Linguistics Conference at UCLA. May 27, 2023.

  • Jenna Ferrario, Mikey Pagan, Laurel Selvig, Erli Tang, and Olivia Vallejo presented Implications for Text Simplification: A Novel Approach to Evaluating Complexity at the 47th Annual Social Science Student Symposium (S4). California State University Monterey Bay. May 5, 2022.

  • Anastasia Smirnova presented Word Order Communicates User’s Intent in Search Queries at WeCNLP 2021. October 29, 2021.

 

Contact

Please send any questions to the lab director, Professor Smirnova, via email at smirnov@sfsu.edu.

 

ECOLE Members

Lab members are SF State students who come from diverse backgrounds and bring to the table their expertise in computer science, data analysis, linguistics, psychology, and cognitive science.

Lab Director

Anastasia Smirnova, Associate Professor

Professor Smirnova’s research focuses on language as a cognitive and social phenomenon. She has published on tense and modality, and on the division of labor between the grammar and the lexicon. Her more recent work focuses on linguistic complexity in reduced registers.

Lab Members (Spring 2024)

Ava Austin

Ava Austin is an undergraduate Linguistics student at SFSU interested in psycholinguistics and sociolinguistics, as well as the experimental projects within the lab. Ava is currently working on a project about English loanword use among young Dutch speakers. Outside of school, Ava's interests include screenwriting, traveling, and reading science fiction.

Jenna Ferrario

Jenna recently graduated from San Francisco State University with an MA degree in Linguistics. She is primarily interested in semantics with a historical emphasis. Currently, she is working on a comprehensive study of spatial relations in Norwegian.

Maxwell Goodwin

Max completed his BA in Linguistics at San Francisco State University and is currently enrolled in the MA program in Linguistics. He works as a high school humanities teacher and hopes to apply the skills he is gaining in computational linguistics to whatever career path he takes after his degree. His interests in linguistics include acoustic phonetics, the syntax-phonology interface, and the syntax-semantics interface. In the future, he hopes to study more deeply the ways in which the different aspects of language interact with each other.

Malleeswari Jagabattuni (MJ)

MJ is a graduate student in the Linguistics Department at San Francisco State University who is also pursuing a post-baccalaureate certificate in Computational Linguistics. She holds an MA in Teaching English to Speakers of Other Languages and a BA in Anthropology from San Francisco State University. She has several years of experience as an English language educator serving immigrant and refugee populations. She is looking to apply her experience and expertise as an educator to the field of computational linguistics and natural language processing. Her research interests include large language models, low-resourced languages, and text simplification.

Mikey Pagán

Mikey is an M.A. student in Comparative & World Literature at SFSU where he also completed B.A.s in Linguistics and Comparative & World Literature. His research interests include semantics and sociolinguistics, especially critical discourse analyses and the ways we make and maintain communicative meaning across disciplines and registers.

Paige Peterson

Paige is pursuing an MA in Education with a Special Interest Concentration in Applied Linguistics. She received her BA from UC Santa Barbara, where she double majored in Linguistics and Sociology. Her research interests are in first and second language acquisition, endangered language revitalization, and the role of gender in language. Paige is also passionate about expanding and strengthening bi/multilingualism in California. In her free time, she enjoys exploring San Francisco and nerding out about language with friends.

Cassia Reddig

Cassia is currently pursuing a Master of Science in Interdisciplinary Studies (emphasis in Language, Cognition, & Computation) with a Graduate Certificate in Ethical AI at San Francisco State University, where she also obtained a BA in Psychology and Computational Linguistics. She has contributed to research in the domains of cognitive psychology and media psychology through SFSU LACE Lab and Stanford Social Media Lab. She is interested in applying multidisciplinary approaches to research natural language processing and human-centered artificial intelligence.

May Reese

May is a graduate student in the linguistics department at San Francisco State University. Her current research is on Large Language Models and Natural Language Understanding benchmark tests with a focus on Japanese. Her academic interests include language typology, computational linguistics and psycholinguistics. In her free time, you’ll find her reading, playing board games or out on a cycling trip.

Julia Whelan

Julia is pursuing an MS in Computer Science and a graduate certificate in Computational Linguistics at San Francisco State University. She holds an MA in Linguistics and a Certificate of Advanced Study in TESOL/SLAT from CSU Fresno, where she completed a Master’s thesis on syllable weight. Her linguistic research interests vary from rhyme to autism, and she is excited to merge her computational skills with her linguistic research, as seen with the digital rhyming dictionary she is developing. In her free time, Julia is an avid musical theater fan and enjoys watching, reading about, and writing musicals.

Lab Alumni

  • Helena Almassy. Professor of Mathematics at Cañada College
  • Lauren Baker. Finance manager at DTR Consulting Services
  • Angie Garcia. Manager at Sound Hound
  • Skyler Ilenstine. Computational Linguist at Microsoft via DISYS
  • Jonathan Kakama. Data Analyst at Vaco
  • Chohee Kim. Senior Software Engineer at LinkedIn
  • Rose Kitchel. Executive Assistant at the Reeds Center
  • Helena Laranetto. Machine Learning Data Linguist II, Alexa Devices at Amazon
  • Sujung Nam. PhD Student at University of Hawaii, Honolulu
  • Jasmine Rivero. Chatbot Operations Manager at Sense
  • Amanda Robinson. Computational Linguist at Samsung
  • Ricardo Romero Sanchez. Linguistic Project Manager at Google
  • Laurel Selvig. Data Analyst at Axos Bank
  • Olivia Vallejo. LV Quality Specialist / Linguist