Research Participation

Effective date: April 26, 2026
Last updated: April 26, 2026

What this is

TeachLex captures behavioral data during writing, including keystroke timing, composition timelines, paste and tab-switch events, vocabulary usage, and error patterns. Most of that data exists for one classroom purpose: helping a teacher trust the work their student submitted.

But across thousands of submissions, a second-language learner research dataset is also forming. Patterns become visible at scale that no individual teacher can see in their own classroom, including how typing fluency develops as proficiency grows, how vocabulary usage distributes across CEFR levels, and how the shape of a composition session changes between A2 and B2.

That research has the potential to help second-language learners and the teachers who serve them. It can also be done badly. This page explains how TeachLex approaches it, what is and isn't included, and how you control whether your classes contribute.

The principle

Research participation is opt-in only. The default for every TeachLex account is no participation. Nothing happens unless a teacher actively chooses Yes when prompted, and consent can be revoked at any time.

This is the conservative position. It is also the position that academic ethics boards and Japanese privacy regulators expect, and the position that lets us partner with universities credibly.

What is never included

The following are never part of any research dataset, regardless of whether a teacher has consented:

Student names
Teacher names
Institution names
Specific class grades, scores, or rankings
Identifiable writing samples (full essays, paragraphs, or any text long enough to be traceable)
Any combination of attributes that could re-identify a specific student, class, or institution
AI training data (no TeachLex data is ever used to train AI systems, by us or by Anthropic, regardless of consent)

This is not a soft commitment. It is enforced at the data extraction layer. The queries that build research datasets do not have access to these fields.
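As an illustration only, enforcement at the extraction layer can be pictured as an allowlist: the research query sees a fixed set of permitted fields, and everything else is dropped before any dataset is built. This is a minimal sketch, not TeachLex's actual code. Every field name below is hypothetical, since this page does not describe the real schema.

```python
# Hypothetical field names for illustration; the real schema is not published here.
RESEARCH_ALLOWED_FIELDS = frozenset(
    {"cefr_level", "dwell_ms", "flight_ms", "error_category"}
)


def research_view(record: dict) -> dict:
    """Return a copy of the record containing only allowlisted fields.

    Fields such as names, grades, or essay text are not on the allowlist,
    so they never reach the code that builds a research dataset.
    """
    return {k: v for k, v in record.items() if k in RESEARCH_ALLOWED_FIELDS}


raw = {
    "student_name": "example student",   # excluded: not on the allowlist
    "grade": 87,                          # excluded: not on the allowlist
    "cefr_level": "B1",
    "dwell_ms": 112,
}
safe = research_view(raw)
```

An allowlist fails closed: a field added to the platform later stays out of research extracts until someone deliberately adds it to the permitted set.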

What may be included, only when a teacher consents

When a teacher chooses Yes to research participation, aggregate, anonymized patterns from their classes may be included in research datasets. Specifically:

Population-level keystroke timing distributions, for example, how dwell and flight times distribute across thousands of B1 writers, with no link back to any individual writer
Aggregate vocabulary usage patterns, including frequency distributions of which CEFR-level words appear in writing at each proficiency level, summed across many writers
Composition timeline shape patterns, including the typical shape of how a piece of writing grows over a session (gradual growth versus burst patterns), aggregated across thousands of submissions
Error type frequency distributions, including how common different error categories are at each CEFR level, in aggregate

These outputs are statistics about populations of writers, not records about individual writers.
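To make the difference between a population statistic and an individual record concrete, here is a minimal sketch of summarizing dwell times by CEFR level. It is an assumption-laden illustration, not the production pipeline: the function name, the input shape, and the choice of summary statistics are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def dwell_summary(samples):
    """Summarize dwell times per CEFR level.

    `samples` is an iterable of (cefr_level, dwell_ms) pairs. The input
    carries no writer identifier, and the output keeps only counts and
    summary statistics for each level.
    """
    by_level = defaultdict(list)
    for level, dwell_ms in samples:
        by_level[level].append(dwell_ms)
    return {
        level: {
            "n": len(values),
            "mean_ms": mean(values),
            "median_ms": median(values),
        }
        for level, values in sorted(by_level.items())
    }


summary = dwell_summary([("B1", 100), ("B1", 120), ("A2", 150)])
```

The result describes each proficiency level as a whole. None of the individual measurements, and nothing about who produced them, survives in the output.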

How we use the data

Research outputs may be used for:

Academic publications, jointly with university partners
Public benchmarks for L2 writing process research
Internal pedagogical tools that improve grading and feedback for all TeachLex teachers (when a tool is built using anonymized aggregate insights, those insights stay in the platform)

Research outputs are not sold. They are not licensed to data brokers. They are not used to target advertising (TeachLex carries no advertising). The principle is the same one stated elsewhere in our policies: we do not sell student data, ever.

How participation works

When you first arrive at your dashboard as a new teacher, a one-time prompt asks whether you want to contribute to language learning research. You can choose Yes, choose No, or choose to read this page first and decide later.

Whatever you decide, you can change your mind anytime from your account settings. The control has two states:

Not participating (the default). None of your classes' data is included in any research dataset generated while this setting is active.
Currently participating. Your classes' anonymized aggregate patterns may be included in datasets generated while consent is active. The "never included" list above still applies: names, grades, and identifiable text remain excluded.

You can switch between these states at any time. Switching to "Not participating" prevents your data from being included in any future extract. Datasets that have already been generated are immutable, but they contain only aggregate, anonymized statistics. There is no information in them that could be traced back to you, your classes, or your students.
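The timing rule described here (consent is checked when an extract is generated, and earlier extracts do not change) can be sketched as follows. This is a simplified illustration under stated assumptions, not the real system. The class, field, and function names are hypothetical, and the sketch tracks only which accounts are eligible, not the aggregate statistics a real dataset would hold.

```python
from dataclasses import dataclass


@dataclass
class TeacherAccount:
    teacher_id: str                # hypothetical identifier
    participating: bool = False    # default state: not participating


def build_extract(accounts):
    """Snapshot which accounts are eligible at the moment of extraction.

    A tuple is returned so the snapshot cannot be modified afterwards.
    """
    return tuple(a.teacher_id for a in accounts if a.participating)


accounts = [TeacherAccount("t1"), TeacherAccount("t2")]

accounts[0].participating = True       # t1 chooses Yes
first_extract = build_extract(accounts)

accounts[0].participating = False      # t1 later switches back
second_extract = build_extract(accounts)
```

In this sketch the first extract reflects t1's consent at the time it was built, and the second extract, built after the switch, includes no one.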

Students and their data

Under our Privacy Policy, TeachLex acts as a processor of student data on behalf of the teacher who owns the class. Research consent operates within that same relationship. A teacher consenting to research participation is consenting on behalf of the classes they control.

This is a real responsibility. We ask teachers to consent only after reading this page, and only if they are comfortable that their institution's policies and their professional judgment support participation. If you are unsure, the right answer is to choose No. You can always change your mind later.

For students with concerns: speak to your teacher first. They control whether your class's data is part of any research extract. If you have remaining concerns after that conversation, you can email us at privacy@teachlex.com.

Research partnerships

If TeachLex enters formal research partnerships with universities or research bodies, those partnerships will be:

Disclosed publicly on this page when active
Conducted under appropriate institutional ethics approval
Subject to data-sharing agreements that prohibit re-identification attempts and restrict use to the agreed research purposes

As of this page's publication, TeachLex is not in any formal research partnership. When that changes, this section will be updated and any teacher with active consent will be notified.

Why we are doing this

A practitioner-built ESL platform with TeachLex's data infrastructure is unusual. The combination of keystroke-level capture, large-scale CEFR-aligned vocabulary analysis, and longitudinal student writing in real classrooms exists in few academic datasets. There is a real chance to contribute meaningful, ethically sourced data to a field that has historically had to make do with small, lab-based studies.

We would rather do that responsibly with a smaller pool of consenting teachers than aggressively with everyone. That is what this consent system is for.

Changes to this page

Material changes to how research data is collected or used will be:

Reflected here with an updated "Last updated" date
Communicated by email, at least 14 days before the change takes effect, to teachers who have active consent
Subject to fresh consent if the change is significant enough that prior consent should not carry over

Contact

For questions about TeachLex research participation, email privacy@teachlex.com.

James Saunders-Wyndham
Operator, TeachLex
Osaka, Japan