Research Participation
Effective date: April 26, 2026
Last updated: April 26, 2026
What this is
TeachLex captures behavioral data during writing, including keystroke timing, composition timelines, paste and tab-switch events, vocabulary usage, and error patterns. Most of that data exists for one classroom purpose: helping a teacher trust the work their student submitted.
But across thousands of submissions, a second-language learner research dataset is also forming. Patterns become visible at scale that no individual teacher can see in their own classroom, including how typing fluency develops as proficiency grows, how vocabulary usage distributes across CEFR levels, and how the shape of a composition session changes between A2 and B2.
That research has the potential to help second-language learners and the teachers who serve them. It can also be done badly. This page explains how TeachLex approaches it, what is and isn't included, and how you control whether your classes contribute.
The principle
Research participation is opt-in only. The default for every TeachLex account is no participation. Nothing happens unless a teacher actively chooses Yes when prompted, and consent can be revoked at any time.
This is the conservative position. It is also the position that academic ethics boards and Japanese privacy regulators expect, and the position that lets us partner with universities credibly.
What is never included
The following are never part of any research dataset, whether or not a teacher has consented:
- Student names
- Teacher names
- Institution names
- Specific class grades, scores, or rankings
- Identifiable writing samples (full essays, paragraphs, or any text long enough to be traceable)
- Any combination of attributes that could re-identify a specific student, class, or institution
- AI training data (no TeachLex data is ever used to train AI systems, by us or by Anthropic, regardless of consent)
This is not a soft commitment. It is enforced at the data extraction layer. The queries that build research datasets do not have access to these fields.
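For readers who want a concrete picture, one common way to enforce this kind of rule is an allowlist at the extraction step. The sketch below is illustrative only and is not TeachLex's actual code: the field names, the RESEARCH_FIELDS set, and the project_for_research function are all hypothetical. It shows the general idea that a research query copies only explicitly permitted fields, so names, grades, and essay text are never read in the first place.

```python
from typing import Any, Mapping

# Hypothetical allowlist: the only fields a research query is permitted to read.
RESEARCH_FIELDS = frozenset({"cefr_level", "dwell_ms", "flight_ms", "error_category"})


def project_for_research(record: Mapping[str, Any]) -> dict[str, Any]:
    """Copy only allowlisted fields; every other field is left behind."""
    return {key: value for key, value in record.items() if key in RESEARCH_FIELDS}


# Hypothetical source row containing both excluded and permitted fields.
row = {
    "student_name": "example student",
    "institution": "example school",
    "grade": 87,
    "essay_text": "example text",
    "cefr_level": "B1",
    "dwell_ms": 112,
    "flight_ms": 240,
    "error_category": "article",
}

projected = project_for_research(row)
print(projected)
```

An allowlist fails safe: a field added to the platform later stays out of research extracts until someone deliberately permits it.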
What may be included, only when a teacher consents
When a teacher chooses Yes to research participation, aggregate, anonymized patterns from their classes may be included in research datasets. Specifically:
- Population-level keystroke timing distributions, for example, how dwell and flight times distribute across thousands of B1 writers, with no link back to any individual writer
- Aggregate vocabulary usage patterns, including frequency distributions of which CEFR-level words appear in writing at each proficiency level, summed across many writers
- Composition timeline shape patterns, including the typical shape of how a piece of writing grows over a session (gradual growth versus burst patterns), aggregated across thousands of submissions
- Error type frequency distributions, including how common different error categories are at each CEFR level, in aggregate
These outputs are statistics about populations of writers, not records about individual writers.
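As a rough illustration of what "statistics about populations" can mean in practice, the sketch below groups timing samples by CEFR level and reports only a count and a median per level. Everything in it is an assumption made for the example: the dwell_summary function, the sample values, and in particular the MIN_GROUP_SIZE threshold, which this page does not specify. The input carries no writer identifier, so the output cannot be linked back to an individual.

```python
from collections import defaultdict
from statistics import median

# Hypothetical minimum group size; groups smaller than this are not reported.
MIN_GROUP_SIZE = 3


def dwell_summary(samples):
    """Summarize (cefr_level, dwell_ms) pairs that carry no writer identifier."""
    by_level = defaultdict(list)
    for level, dwell_ms in samples:
        by_level[level].append(dwell_ms)
    # Report only a count and a median per level, and skip small groups.
    return {
        level: {"n": len(values), "median_dwell_ms": median(values)}
        for level, values in by_level.items()
        if len(values) >= MIN_GROUP_SIZE
    }


# Made-up sample values for illustration.
samples = [("B1", 100), ("B1", 120), ("B1", 140), ("A2", 180)]
summary = dwell_summary(samples)
print(summary)
```

In this example the single A2 sample is dropped because one data point would describe a writer, not a population.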
How we use the data
Research outputs may be used for:
- Academic publications, jointly with university partners
- Public benchmarks for L2 writing process research
- Internal pedagogical tools that improve grading and feedback for all TeachLex teachers (when a tool is built using anonymized aggregate insights, those insights stay in the platform)
Research outputs are not sold. They are not licensed to data brokers. They are not used to target advertising (TeachLex carries no advertising). The principle is the same one stated elsewhere in our policies: we do not sell student data, ever.
How participation works
When you first arrive at your dashboard as a new teacher, a one-time prompt asks whether you want to contribute to language learning research. You can choose Yes, choose No, or choose to read this page first and decide later.
Whatever you decide, you can change your mind anytime from your account settings. The control has two states:
- Not participating (the default). None of your classes' data is included in any research dataset, past or future.
- Currently participating. Your classes' anonymized aggregate patterns may be included in datasets generated while consent is active. The "never included" list above still applies: names, grades, and identifiable text are still excluded.
You can switch between these states at any time. Switching to "Not participating" prevents your data from being included in any future extract. Datasets that have already been generated are immutable, but they contain only aggregate, anonymized statistics. There is no information in them that could be traced back to you, your classes, or your students.
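The behavior described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the real system: the Extract class, the build_extract function, and the teacher identifiers are hypothetical. It shows three properties: a missing consent record counts as "Not participating", revoking consent affects every later extract, and an extract that already exists is not modified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extract:
    """An already-generated dataset; frozen so it cannot be changed afterwards."""
    contributing_classes: int


def build_extract(consent_by_teacher: dict[str, bool],
                  classes_by_teacher: dict[str, int]) -> Extract:
    # A teacher with no consent record is treated as not participating (opt-in default).
    total = sum(
        class_count
        for teacher, class_count in classes_by_teacher.items()
        if consent_by_teacher.get(teacher, False)
    )
    return Extract(contributing_classes=total)


consent = {"t1": True}            # t2 has never answered the prompt
classes = {"t1": 2, "t2": 3}

first = build_extract(consent, classes)    # only t1's classes contribute
consent["t1"] = False                      # t1 switches to "Not participating"
second = build_extract(consent, classes)   # nothing contributes any more

print(first.contributing_classes, second.contributing_classes)
```

Checking consent at the moment each extract is built is what makes revocation take effect for all future extracts without touching earlier ones.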
Students and their data
Under our Privacy Policy, TeachLex acts as a processor of student data on behalf of the teacher who owns the class. Research consent operates within that same relationship. A teacher consenting to research participation is consenting on behalf of the classes they control.
This is a real responsibility. We ask teachers to consent only after reading this page, and only if they are comfortable that their institution's policies and their professional judgment support participation. If you are unsure, the right answer is to choose No. You can always change your mind later.
For students with concerns: speak to your teacher first. They control whether your class's data is part of any research extract. If you have remaining concerns after that conversation, you can email us at privacy@teachlex.com.
Research partnerships
If TeachLex enters formal research partnerships with universities or research bodies, those partnerships will be:
- Disclosed publicly on this page when active
- Conducted under appropriate institutional ethics approval
- Subject to data-sharing agreements that prohibit re-identification attempts and restrict use to the agreed research purposes
As of this page's publication, we are not in any formal research partnership. When that changes, this section will be updated and every teacher with active consent will be notified.
Why we are doing this
A practitioner-built ESL platform with the data infrastructure that TeachLex has is unusual. The combination of keystroke-level capture, large-scale CEFR-aligned vocabulary analysis, and longitudinal student writing from real classrooms is rare in academic datasets. There is a real chance to contribute meaningful, ethically sourced data to a field that has historically had to make do with small, lab-based studies.
We would rather do that responsibly with a smaller pool of consenting teachers than aggressively with everyone. That is what this consent system is for.
Changes to this page
Material changes to how research data is collected or used will be:
- Reflected here with an updated "Last updated" date
- Communicated by email, at least 14 days before the change takes effect, to teachers who have active consent
- Subject to fresh consent if the change is significant enough that prior consent should not carry over
Contact
For questions about TeachLex research participation, email privacy@teachlex.com.
James Saunders-Wyndham
Operator, TeachLex
Osaka, Japan