In the fall of 2023, the Los Angeles Unified School District – the second largest in the United States, serving over 560,000 students – launched “Ed,” an AI-powered chatbot built on a customized version of GPT-4. The system was designed to provide personalized tutoring, college counseling, and family communication in multiple languages. Within six months, the district had quietly shelved the project after spending $6 million, citing performance issues and concerns about data handling. The students whose interactions had been processed through the system were never notified about what data had been collected, where it had been sent, or how long it would be retained.

Ed was not an outlier. It was a prototype for a pattern now repeating across American education: administrators under pressure to modernize adopt AI tools with inadequate data privacy review, students and parents have no meaningful input into the decision, and the regulatory framework – FERPA, a law written in 1974 – provides far less protection than most people assume.

The Family Educational Rights and Privacy Act was signed by President Ford the same year the Altair 8800 – often credited as the first commercially successful personal computer – was introduced. FERPA was designed to give parents access to their children’s education records and to prevent schools from disclosing those records without consent. It was not designed for an era when an AI tutoring system might process a student’s learning patterns, emotional states, writing samples, and academic struggles through servers controlled by a company whose primary business is training larger AI models.

What FERPA Actually Protects (and What It Doesn’t)

FERPA applies to educational institutions that receive funds from the U.S. Department of Education – which means virtually every public school, college, and university in the United States. The law protects “education records,” defined as records directly related to a student that are maintained by an educational institution or a party acting on its behalf.

The critical word is “maintained.” FERPA’s protections attach to records that are kept, stored, or retained. A conversation that passes through an AI system and is immediately deleted may not constitute a “maintained” record under FERPA’s framework. This creates a regulatory gap that is precisely the width of an API call: student data that transits through an AI system without being formally stored by the school may fall outside FERPA’s protection, even though it was transmitted to a third party and processed on external servers.

The School Official Exception

FERPA’s most significant exception for AI adoption is the “school official” exception, which allows schools to disclose education records to other school officials with “legitimate educational interests.” Schools can designate contractors, consultants, and volunteers as “school officials” if they perform a function for which the school would otherwise use employees and if they are under the direct control of the school with respect to the use and maintenance of education records.

This exception is the legal pathway through which most EdTech vendors – and now AI vendors – access student data. A school district that contracts with an AI tutoring platform can designate the platform as a “school official” and share student records without parental consent. The district must identify such designees in its annual FERPA notification, and the vendor may use the records only for the outsourced function, but the mechanism essentially allows schools to extend student data access to commercial entities through contractual designation.

The problem is that “direct control” over an AI vendor’s data practices is largely fictional. When a school district designates OpenAI, Google, or Microsoft as a school official, the district does not control the vendor’s data pipeline, server architecture, or training processes. The contractual terms may restrict data use, but the district’s ability to verify compliance is limited. As the AI provider privacy scoreboard documents, the gap between policy promises and technical architecture is often significant.

No Private Right of Action

Perhaps FERPA’s most significant limitation is procedural: the law provides no private right of action. Students and parents cannot sue a school or vendor for FERPA violations. The only enforcement mechanism is the U.S. Department of Education’s Student Privacy Policy Office (SPPO), which can investigate complaints and, in theory, withdraw federal funding from non-compliant institutions. In practice, no school has ever lost federal funding for a FERPA violation. The SPPO issued 98 complaint resolution letters in 2024, none of which involved AI-specific data practices.

This enforcement vacuum means that even when AI tools violate FERPA’s requirements, the practical consequences are minimal. Compare this with HIPAA, where privacy fines and enforcement actions have reached hundreds of millions of dollars, or the GDPR, where the enforcement apparatus has levied billions in penalties. FERPA’s enforcement is, by any measure, the weakest among major privacy frameworks.

The Khan Academy-OpenAI Partnership: A Case Study

Khan Academy’s Khanmigo, launched in partnership with OpenAI in 2023 and expanded significantly through 2024-2025, represents the most visible deployment of frontier AI in education. The platform uses GPT-4 to provide one-on-one tutoring, with the AI acting as a Socratic tutor that guides students through problems rather than providing direct answers.

From a data perspective, Khanmigo processes student interactions – questions, answers, learning patterns, areas of difficulty, writing samples, and conversational exchanges – through OpenAI’s API. Khan Academy’s terms specify that student data processed through Khanmigo is covered by OpenAI’s API data use policy, which states that API data is not used for model training. This is a meaningful distinction from OpenAI’s consumer data practices, which do include training by default.

But several data questions remain:

Retention: How long are student interaction records retained? Khan Academy’s privacy policy specifies data retention practices, but the intersection of Khan Academy’s retention, OpenAI’s API data retention (which includes up to 30 days of retention for abuse monitoring), and FERPA – which sets no retention limits at all – creates ambiguity.

Metadata: Even if conversation content is not used for training, interaction metadata – timestamps, session duration, frequency of use, error patterns – is collected and may be used for product improvement. For students, metadata about learning patterns can be as sensitive as the conversations themselves. A student who repeatedly asks questions about a specific topic at 2 AM is revealing information about their academic struggles, study habits, and sleep patterns.
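
As an illustration of how much metadata alone can reveal, consider a toy event log – the field names, values, and threshold below are hypothetical, not any vendor’s actual schema:

```python
from collections import Counter
from datetime import datetime

# Hypothetical interaction metadata: no conversation content at all,
# just (student_id, topic, ISO timestamp) tuples.
events = [
    ("s-1042", "fractions", "2025-03-03T02:11:00"),
    ("s-1042", "fractions", "2025-03-04T01:47:00"),
    ("s-1042", "fractions", "2025-03-06T02:05:00"),
    ("s-1042", "reading",   "2025-03-05T16:30:00"),
]

def struggle_signals(events, late_hour=5):
    """Infer a struggle topic and late-night study habits from
    metadata alone -- no message content required."""
    topics = Counter()
    late_night_sessions = 0
    for _student, topic, ts in events:
        topics[topic] += 1
        if datetime.fromisoformat(ts).hour < late_hour:
            late_night_sessions += 1
    return {
        "likely_struggle_topic": topics.most_common(1)[0][0],
        "late_night_sessions": late_night_sessions,
    }

print(struggle_signals(events))
# {'likely_struggle_topic': 'fractions', 'late_night_sessions': 3}
```

Four timestamps and two topic labels are enough to flag a specific child as struggling with fractions and losing sleep over it.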

De-identification: Khan Academy states that data shared with OpenAI is de-identified. But de-identification of free-text student conversations faces the same challenges as clinical text de-identification in healthcare: the content itself can be identifying, especially for students with unusual backgrounds, specific disabilities, or distinctive writing patterns. The limitations of PII stripping in unstructured text apply with full force.
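
A minimal sketch of why pattern-based scrubbing falls short (the patterns and the example message are illustrative, not any vendor’s actual pipeline):

```python
import re

# Naive pattern-based "de-identification": catches surface PII only.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b(?:Mr|Ms|Mrs)\.\s+\w+"), "[NAME]"),
]

def scrub(text: str) -> str:
    """Replace surface identifiers; quasi-identifiers pass through."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

message = ("Mr. Alvarez said I should ask you. I'm the only deaf student "
           "at Lincoln Middle and I just moved here from Moldova.")
print(scrub(message))
# "[NAME] said I should ask you. I'm the only deaf student at
#  Lincoln Middle and I just moved here from Moldova."
```

Every surface identifier is gone, yet the remaining sentence identifies exactly one student to anyone who knows the school.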

Scale: Khanmigo processed interactions from over 3 million students by early 2026. At this scale, the aggregate data – even if individually de-identified – constitutes a dataset of enormous value for understanding learning patterns, cognitive development, and educational outcomes. The question of who owns these insights remains unresolved.

State-Level Responses: The Patchwork

Recognizing FERPA’s limitations, state legislatures have enacted their own student data privacy laws, creating a complex patchwork of requirements:

California (SOPIPA, 2014, amended 2024): The Student Online Personal Information Protection Act prohibits EdTech operators from using student data for non-educational purposes, selling student data, or building advertising profiles. The 2024 amendments added specific provisions for AI, requiring EdTech vendors to disclose AI processing practices and prohibiting the use of student data for AI model training without explicit consent.

New York (Education Law 2-d, updated 2024): Requires school districts to adopt data privacy and security standards, notify parents of third-party data sharing, and maintain a data inventory. The 2024 AI guidance from the New York State Education Department required districts to conduct privacy impact assessments before deploying AI tools.

Illinois (SOPPA, 2021): Requires data processing agreements between schools and vendors, parental notification, and data breach notification. Does not specifically address AI but applies to AI vendors that process student data.

Colorado (Student Data Transparency and Security Act, updated 2025): Added AI-specific provisions requiring vendors to disclose AI processing, prohibiting AI-driven decisions that adversely affect students’ educational opportunities, and mandating human oversight of AI-driven recommendations.

By early 2026, 47 states had enacted student data privacy laws of varying scope. But the patchwork creates compliance complexity for national EdTech and AI vendors, and the enforcement resources at the state level are often minimal. The strongest protection comes not from any single law but from the aggregate effect of multiple overlapping requirements – an approach that creates compliance cost but not necessarily compliance effectiveness.

The Vulnerable Population Problem

Students – particularly K-12 students – represent a uniquely vulnerable population for AI data practices:

Age and consent: FERPA gives consent rights to parents until the student turns 18 or enters a postsecondary institution. But parental consent in the AI context is often pro forma: parents receive a notification in a packet of back-to-school paperwork and are told that the school uses technology tools subject to the school’s privacy policy. Meaningful informed consent – where parents understand that their child’s interactions with an AI tutor will be processed by a company in San Francisco with specific data retention and training practices – is rare to nonexistent.

Mandatory participation: Unlike consumer AI use, which is voluntary, students often have no choice about whether to use AI tools assigned by their school. When a teacher assigns work through an AI-powered platform, opting out means opting out of the assignment. This coercive dynamic eliminates the “market” mechanism that consumer privacy advocates rely on.

Long-term data risk: Student data has a uniquely long exposure window. A learning profile created in third grade persists for decades. If that data is incorporated into a training dataset, it cannot later be removed from the models trained on it. The model memorization problem means that specific student interactions could theoretically be recoverable from model weights years after the interaction occurred. For a child who is now 8, the consequences of today’s data exposure may not materialize until they are 28.

Developmental sensitivity: Students share things with AI tutors that they would not share with teachers, parents, or counselors. AI chatbots elicit a confessional dynamic – users disclose more to machines than to humans. When the user is a 14-year-old struggling with math, mental health, family problems, or identity questions, the data generated is extraordinarily sensitive.

What Responsible AI Adoption Looks Like in Education

For school districts and educational institutions, responsible AI adoption requires going significantly beyond FERPA’s minimums:

Pre-Deployment Assessment

  • Data flow mapping for every AI tool: what student data enters the system, where it is processed, who has access, and how long it is retained (see the inventory sketch after this list)
  • Privacy impact assessments specific to AI, not generic technology assessments
  • Vendor evaluation using frameworks comparable to the AI compliance checklist
  • Review of the AI vendor’s technical architecture, not just their contractual promises
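
One way to make the first item concrete is a per-tool inventory record – a sketch only, with hypothetical field names and example values rather than any standard schema:

```python
from dataclasses import dataclass

@dataclass
class AIDataFlowRecord:
    """One entry in a district's per-tool data flow inventory."""
    tool: str
    vendor: str
    data_elements: list[str]          # what student data enters the system
    processing_locations: list[str]   # where it is processed
    subprocessors: list[str]          # who else touches the data
    access: list[str]                 # who has access
    retention: str                    # how long it is retained, per contract
    training_use: str                 # contractual stance on model training
    ferpa_basis: str                  # e.g., "school official" designation
    architecture_verified: bool       # promises checked against architecture?

example = AIDataFlowRecord(
    tool="AI tutoring platform",
    vendor="ExampleVendor Inc.",      # hypothetical
    data_elements=["chat transcripts", "error patterns", "timestamps"],
    processing_locations=["vendor cloud (us-west)"],
    subprocessors=["upstream LLM API provider"],
    access=["vendor engineers", "district administrators"],
    retention="30 days, per data protection agreement",
    training_use="prohibited without explicit parental consent",
    ferpa_basis="school official exception",
    architecture_verified=False,      # flags an open gap before deployment
)
```

A record like this makes the gaps auditable: any field left blank, or any `architecture_verified=False`, is a finding to resolve before deployment rather than after.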

Transparency

  • Plain-language notification to parents and students (age-appropriate) about AI tool use, data practices, and alternatives
  • Published inventory of AI tools in use across the district
  • Annual reporting on AI data practices and any incidents

Technical Controls

  • Preference for AI tools that process data locally or within the district’s infrastructure
  • Zero-persistence architecture requirements for AI vendors – no retention of student data beyond the immediate interaction
  • Client-side PII stripping for any data transmitted to external AI services (sketched, together with the next item, after this list)
  • Prohibition on AI tool use with student data on consumer-tier products without enterprise data protection agreements
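
A combined sketch of the last two controls – the endpoint, allowlist, and response shape are hypothetical; a production redactor would be NER-based and, per the de-identification caveats above, still not sufficient on its own:

```python
import json
import re
import urllib.request

# Hypothetical allowlist: only endpoints covered by an enterprise data
# protection agreement (DPA) with zero-retention terms may receive data.
DPA_COVERED = {"https://ai-gateway.example-district.org/v1/tutor"}

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
STUDENT_ID = re.compile(r"\b\d{6,}\b")

def ask_ai(text: str, endpoint: str) -> str:
    """Redact on the client, then transmit only to DPA-covered endpoints."""
    if endpoint not in DPA_COVERED:
        raise PermissionError("no enterprise DPA for this endpoint; refusing")
    redacted = STUDENT_ID.sub("[ID]", EMAIL.sub("[EMAIL]", text))
    request = urllib.request.Request(
        endpoint,
        data=json.dumps({"input": redacted}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode())["output"]  # hypothetical shape
```

The design point is placement: redaction happens before the data leaves the device, and the consumer-tier prohibition is enforced in code rather than in a policy document.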

Student Agency

  • Opt-out mechanisms that do not penalize students academically
  • Student data access and deletion rights that go beyond FERPA’s minimums
  • Age-appropriate AI literacy education so students understand what happens to their data

Education is the sector where the gap between AI adoption speed and privacy protection is widest. Schools are adopting AI tools under competitive pressure (“every other district has AI tutoring”) and funding pressure (“the grant requires technology integration”) without the compliance infrastructure, technical expertise, or legal resources to evaluate the privacy implications. The students caught in this gap deserve better than a 1974 law and a privacy office that has never defunded a school.

The Stealth Cloud Perspective

Students do not choose which AI tools their schools deploy, cannot meaningfully consent to data processing, and will live with the consequences of today’s data decisions for decades. Zero-persistence infrastructure is not a luxury for education technology – it is an obligation. When the user cannot protect themselves, the architecture must protect them.