Make AI Good

Te Hiku Media Kaitiakitanga License — Māori-data-sovereignty stewardship licensing for indigenous AI (2018–ongoing)

01 · In focus

One campaign, in the field.

The structured facts the source records about Te Hiku Media Kaitiakitanga License — Māori-data-sovereignty stewardship licensing for indigenous AI (2018–ongoing), the count of declared adjacencies in the corpus, and the federation map zoomed on this node and its neighbours.

campaign

2 declared connections

Kind: Campaign
Status: active
Confidence: high
Start: 2018-10
End: ongoing
Entity ID: camp-te-hiku-media-kaitiakitanga-license

Tags: aotearoa, new-zealand, kaitaia, oceania, polynesia, indigenous-led, maori, iwi-broadcaster, te-reo, language-revitalisation, indigenous-data-sovereignty, kaitiakitanga, data-stewardship, stewardship-license, licensing, data-governance, community-data-governance, ai-ethics, tikanga, anti-digital-colonization, papa-reo, korero-maori, whare-korero, rongo

Te Hiku Media Kaitiakitanga License — Māori-data-sovereignty stewardship licensing for indigenous AI (2018–ongoing) · 1 direct neighbour visible

02 · Connections

2 adjacencies, by relation.

Split by direction. Direct links are the ones this entity's source record names; inferred backlinks are records elsewhere in the corpus that point at this entity.

Direct from this record

2 links

Links named in this entity's structured fields.

03 · Background

From the source record.

Body prose as it appears in movement-graph’s published markdown for this entity. Links to other corpus entities resolve to their graph page; links to deeper repo paths are kept as text so the page does not invent a route.

This is the sustained licensing-and-advocacy effort that Te Hiku Media — the iwi-broadcaster and indigenous-AI organisation based in Kaitaia in the Far North of Aotearoa New Zealand — has carried since the original public release of the Kaitiakitanga License on 2 October 2018. The campaign establishes a tikanga-grounded data-licensing framework under which Māori-language audio and the natural-language-processing models trained on it are stewarded by Te Hiku Media on behalf of source communities — held under the principle of kaitiakitanga ("guardianship") rather than owned as transferable intellectual property — and under which any benefit derived from the data flows back to those communities. The campaign is the principal working example in the global make-AI-good corpus of an indigenous-led organisation building frontier AI capability under a community-set legal-and-tikanga apparatus rather than under commercial open-source or proprietary licensing norms, and the principal Aotearoa New Zealand anchor for the wider Indigenous data-sovereignty movement.

The License: principle and operative terms

The Kaitiakitanga License's grounding move is to reframe the relationship between an organisation and the data it holds. The current License text on the repository's tumu branch opens with kaitiaki — Māori for "guardian, protector, and custodian" — as the relationship Te Hiku Media holds toward the data, an explicit rejection of the ownership-and-transfer logic of conventional open-source and proprietary licences. The preamble names digital colonisation as the failure mode the License is designed to resist, observing that "the majority of tangata whenua and other Indigenous peoples may not have access to the resources that enable them to benefit from open source technologies" — open-sourcing indigenous data alone, on the License's analysis, completes rather than resists that colonisation.

The four operative terms follow from the preamble: access, use, contribution, or modification of repository code requires explicit permission from Te Hiku Media as rights-holder; commercial use of the code requires further explicit permission and is not granted by default; works derived from the code remain bound by the Kaitiakitanga License; and works built using the code become themselves subject to the License. The License describes itself, in its own text, as "a work in progress" and "a living license" — versioned through public repository commits rather than presented as a finished legal artefact. Papa Reo's home page condenses the principle for downstream users: data is not owned but is cared for under kaitiakitanga, and any benefit derived from that data flows back to its source community.
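As a reading aid, the four operative terms summarised above can be sketched as a small decision function. This is an illustrative Python model only: it is not the License text, not legal advice, and not code from Te Hiku Media's repository. The `Request` class, its field names, and both function names are hypothetical and were introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model of the four operative terms as summarised in this record.
# Assumption: all names below are illustrative and do not appear in the License.

COVERED_ACTIONS = {"access", "use", "contribute", "modify"}
LICENSE_NAME = "Kaitiakitanga License"


@dataclass(frozen=True)
class Request:
    """A hypothetical request to do something with the licensed repository code."""
    action: str                              # one of COVERED_ACTIONS
    has_permission: bool                     # explicit permission from Te Hiku Media
    commercial: bool = False                 # is the intended use commercial?
    has_commercial_permission: bool = False  # further explicit permission


def is_permitted(req: Request) -> bool:
    """Apply terms 1 and 2: permission-based access and the commercial-use restriction."""
    if req.action not in COVERED_ACTIONS:
        raise ValueError(f"unknown action: {req.action!r}")
    # Term 1: access, use, contribution, or modification needs explicit permission.
    if not req.has_permission:
        return False
    # Term 2: commercial use needs further explicit permission; not granted by default.
    if req.commercial and not req.has_commercial_permission:
        return False
    return True


def licence_binding(derived_from_code: bool, built_using_code: bool) -> Optional[str]:
    """Apply terms 3 and 4: derived works and works built with the code stay bound."""
    if derived_from_code or built_using_code:
        return LICENSE_NAME
    return None
```

Under this sketch, a non-commercial request with explicit permission is permitted, a commercial request without the further permission is not, and any work derived from or built with the code carries the same licence forward.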

Origin and political framing

Te Hiku Media's own essay Data Sovereignty and the Kaitiakitanga License carries the campaign's strongest single-sentence framing: "artificial intelligence in its current form is based on the wholesale appropriation of existing culture". The Kaitiakitanga License is presented as the legal-and-tikanga response to that appropriation. The same essay positions the License as grounded in tikanga rather than in the commercial open-source norms that, the organisation argues, do not in fact serve smaller indigenous-language communities. The political stake is named in the campaign's most-cited interview material: CTO Keoni Mahelona's "Data is the last frontier of colonization", and Chief Executive Peter-Lucas Jones's framing of the counterfactual, in which, absent stewardship structures, Māori-language data would be "used by the very people that beat that language out of our mouths to sell it back to us as a service". Jones's broader framing of indigenous AI agency ("I don't just think about how AI can be used, I think about how we can be the makers of AI") sits at the centre of the License's strategic purpose: the licensing framework is the apparatus that makes that agency operational rather than only rhetorical.

Kōrero Māori: the first operational deployment

The campaign's earliest sustained operational test was the Kōrero Māori crowdsourcing operation, which collected the te reo Māori speech corpus on which the Papa Reo platform was subsequently trained. The operation's design embedded the License explicitly into the data-collection contract: every contributor worked under terms that bound the resulting corpus to the stewardship framework rather than releasing it into a commons or to a private platform. The operation produced more than 300 hours of labelled te reo Māori speech in 10 days, with over 2,500 community contributors reading more than 200,000 phrases. That result demonstrated that a community-set licence with restrictive commercial terms could attract the scale of voluntary participation needed for a credible automatic-speech-recognition corpus, and the campaign has since presented it publicly as the practical refutation of the "we can't build it under such restrictive terms" objection.
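To give a sense of scale, the reported Kōrero Māori figures can be turned into rough per-day and per-contributor averages. The sketch below treats the "more than" figures as round numbers, which is an assumption, so the results are approximate averages and not measured values. The function name `corpus_rates` is hypothetical.

```python
def corpus_rates(hours: float, days: int, contributors: int, phrases: int) -> dict:
    """Derive rough averages from headline corpus figures.

    Assumption: the inputs are the rounded headline numbers from the record
    ("more than 300 hours", "over 2,500 contributors", "more than 200,000 phrases"),
    so every output is an approximation.
    """
    return {
        "hours_per_day": hours / days,
        "phrases_per_contributor": phrases / contributors,
        "seconds_per_phrase": hours * 3600 / phrases,
        "minutes_per_contributor": hours * 60 / contributors,
    }


rates = corpus_rates(hours=300, days=10, contributors=2_500, phrases=200_000)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

On these round figures the operation averaged about 30 hours of recorded speech per day, 80 phrases and roughly 7 minutes of audio per contributor, and a little over 5 seconds per phrase.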

Papa Reo: platform-scale operation under the License

The Papa Reo platform is the largest sustained deployment of the Kaitiakitanga License to date. The platform was established with a NZ$13 million, seven-year Strategic Science Investment Fund placement from the Ministry of Business, Innovation and Employment in October 2019 and is positioned, in its own framing, as a multilingual NLP platform whose mission is to "enable smaller indigenous language communities to develop their own speech recognition and natural language processing capabilities" under the same data-sovereignty terms. The Papa Reo production models, trained on the Kaitiakitanga-licensed corpus, reach 92 percent accuracy on te reo Māori transcription and 82 percent accuracy on bilingual te reo / English speech. The campaign cites these figures to argue that operating under community-stewardship terms does not compromise quality and in fact demonstrates capability. Te Hiku Media's downstream tools all operate under the same licence terms: the Kaituhi automatic-transcription system, the Whare Kōrero archive (which holds more than 30 years of digitised iwi-broadcast material and about 1,000 hours of native-speaker te reo), the Rongo pronunciation app, and the Papa Reo speech-to-text API. The Spinoff describes the License as the apparatus distinguishing Te Hiku Media's te reo work from extraction-model platforms such as Duolingo.

Propagation: the Taiuru derivations and the Practical Guide

The Kaitiakitanga License's first significant propagation outside Te Hiku Media's own portfolio came on 22 August 2021, when Karaitiana Taiuru published six derived Māori Data Sovereignty Licences building explicitly on Te Hiku Media's foundational artefact. Taiuru's six derivations — the Māori Data Sovereignty Licence 1.1, the Iwi Data Sovereignty Licence 1.1, the Hapū Data Sovereignty Licence 1.1, the Marae/Rūnanga Data Sovereignty Licence 1.1, the Rōpū Māori Data Sovereignty Licence 1.1, and the Whānau Māori Data Sovereignty Licence 1.1 — span the iwi-to-whānau governance levels and embed He Whakaputanga, Te Tiriti o Waitangi, the United Nations Declaration on the Rights of Indigenous Peoples, and the Te Mana Raraunga Māori Data Sovereignty Principles into the licensing scaffolding. Taiuru's stated purpose — that the licences serve to "protect Māori Data and recognise Māori Data Sovereignty rights today and for the next 1000 years" — extends the Kaitiakitanga frame from a single organisation's working apparatus into a multi-level governance vocabulary for Māoridom as a whole.

The second propagation step was Te Hiku Media's own: the publication of A Practical Guide to creating your own Stewardship License on the Papa Reo blog on 8 November 2024. It is the organisation's first explicit propagation toolkit, framed for other indigenous and small-nation organisations seeking to apply the same stewardship logic to their own data and knowledge holdings. The Guide states the campaign's core argument plainly, "by simply open sourcing our data and knowledge, we further allow ourselves to be colonised digitally in the modern world", and positions Rongo and Whare Kōrero as Te Hiku Media's in-house worked examples of the framework's application.

Reception and posture in the broader AI-good movement

The Kaitiakitanga License has been presented as an exemplar by MIT Technology Review (22 April 2022), the International Telecommunication Union's AI for Good hub (August 2022), and recurring Aotearoa coverage, including The Spinoff's 9 June 2025 feature on Māori-led AI and quantum work. That feature reads the License's explicit prohibition of using Te Hiku Media's tools "for discrimination, surveillance, or tracking" as a model for embedding tikanga and tino rangatiratanga into the digital foundations of the country. In the Indigenous AI Working Group's Digital Sovereignty profile, Te Hiku Media is positioned as a leading working example of a sovereignty-preserving indigenous AI apparatus.

Within the make-AI-good corpus the campaign is the principal Aotearoa New Zealand anchor and the principal entity through which the Kaitiakitanga framework — and through it, the wider Indigenous data-sovereignty movement — enters the make-AI-good ecosystem. Its theory of change is that the question of who governs language and cultural data is the operative question of whether indigenous communities can be makers rather than subjects of AI; the License is the apparatus that converts that argument into a binding set of terms under which a real-world indigenous-language AI platform actually operates. As of May 2026 the campaign remains active: the License continues to evolve through the public repository, Papa Reo's product surface continues to expand under the same terms, and the Practical Guide has set up the next stage of propagation to communities and organisations beyond Aotearoa.

04 · Sources

Where this came from.

12 sources listed from the pinned corpus. Links are shown only when the source URL is a valid HTTP(S) address.

  1. github.com

    Checked 2026-05-22

    Te Hiku Media's own GitHub repository for the Kaitiakitanga License — primary source for the repository creation date of 2 October 2018 (per the GitHub API created_at field), for the public publication of the License as a versioned, community-revisable artefact, and for the description of the repository as "Repository for the development of the Kaitiakitanga License"

  2. github.com

    Checked 2026-05-22

    Te Hiku Media's own current License text on the tumu branch — primary source for the License preamble framing kaitiaki as "guardian, protector, and custodian", for the explicit anti-digital-colonisation framing ("majority of tangata whenua and other Indigenous peoples may not have access to the resources that enable them to benefit from open source technologies"), for the four operative terms (permission-based access; commercial-use restriction without explicit permission; derivative binding; use-based binding), and for the verbatim self-description of the License as "a work in progress" and "a living license"

  3. tehiku.nz

    Checked 2026-05-22

    Te Hiku Media's own essay "Data Sovereignty and the Kaitiakitanga License" — primary source for the organisation's stated critique that "artificial intelligence in its current form is based on the wholesale appropriation of existing culture" and for the framing of the License as a stewardship-based licensing apparatus grounded in tikanga rather than the commercial open-source norms it explicitly rejects

  4. papareo.nz

    Checked 2026-05-22

    Papa Reo's own home page — primary source for the verbatim Kaitiakitanga framing that "data is not owned but is cared for under the principle of kaitiakitanga and any benefit derived from data flows to the source of the data", and for the description of Papa Reo as the multilingual NLP platform operating under that licensing logic

  5. blogs.nvidia.com

    Checked 2026-05-22

    NVIDIA developer blog — primary source for the terms of the Kaitiakitanga License as practically applied (data access granted only to organisations that agree to respect Māori values, stay within the bounds of speaker consent, and pass any benefits back to Māori people), for the Kōrero Māori crowdsourcing operation producing more than 300 hours of labelled te reo Māori speech with over 2,500 contributors in 10 days under the License, and for the 92 percent te reo Māori and 82 percent bilingual te reo / English transcription-accuracy figures of the resulting Papa Reo models

  6. technologyreview.com

    Checked 2026-05-22

    MIT Technology Review feature (22 April 2022) — primary source for the international-press framing of the Kaitiakitanga License as a "data license that spells out the ground rules for future collaborations based on the Māori principle of kaitiakitanga, or guardianship", for Keoni Mahelona's verbatim quote "Data is the last frontier of colonization", and for Peter-Lucas Jones's verbatim quote that absent indigenous-stewardship structures Māori-language data would be "used by the very people that beat that language out of our mouths to sell it back to us as a service"

  7. thespinoff.co.nz

    Checked 2026-05-22

    The Spinoff feature (29 July 2022) "Inside the fight for Māori data sovereignty" — independent secondary source for the application of the Kaitiakitanga License to the Rongo pronunciation app, for Keoni Mahelona's verbatim quote "Your data is only ever used for the benefit of Māori and Māori education", and for the framing of the License as the apparatus distinguishing Te Hiku Media's te reo work from extraction-model platforms such as Duolingo

  8. thespinoff.co.nz

    Checked 2026-05-22

    The Spinoff feature (9 June 2025) on Māori-led AI and quantum work — primary source for the framing that the Kaitiakitanga License prohibits use of Te Hiku Media's tools for discrimination, surveillance, or tracking, and for the situation of the License inside the wider Aotearoa New Zealand movement to embed tikanga and tino rangatiratanga into the digital foundations of the country

  9. taiuru.co.nz

    Checked 2026-05-22

    Karaitiana Taiuru's own page on the Kaitiakitanga Māori Data Sovereignty Licences (published 22 August 2021) — primary source for the six derived licences (Māori Data Sovereignty Licence 1.1, Iwi Data Sovereignty Licence 1.1, Hapū Data Sovereignty Licence 1.1, Marae/Rūnanga Data Sovereignty Licence 1.1, Rōpū Māori Data Sovereignty Licence 1.1, Whānau Māori Data Sovereignty Licence 1.1), for the explicit attribution to Te Hiku Media's original Kaitiakitanga License as the foundational artefact, and for the verbatim purpose statement that the licences serve to "protect Māori Data and recognise Māori Data Sovereignty rights today and for the next 1000 years"

  10. blog.papareo.nz

    Checked 2026-05-22

    Papa Reo blog post "A Practical Guide to creating your own Stewardship License" (8 November 2024) — primary source for Te Hiku Media's first explicit propagation toolkit for the Kaitiakitanga model, for the verbatim framing that "by simply open sourcing our data and knowledge, we further allow ourselves to be colonised digitally in the modern world", and for the position that commercial users of indigenous data should pay royalties back to the source communities; identifies Rongo and Whare Kōrero as the named in-house projects operating under the framework

  11. itu.int

    Checked 2026-05-22

    International Telecommunication Union (ITU) AI for Good hub article (August 2022) on AI and indigenous languages — secondary corroboration that Te Hiku Media's Kaitiakitanga-License-governed te reo Māori automatic speech recognition work is treated internationally as an exemplar of indigenous-led, sovereignty-preserving language AI

  12. indigenous-ai.net

    Checked 2026-05-22

    Indigenous AI Working Group Digital Sovereignty profile — primary source for the positioning of Te Hiku Media and the Kaitiakitanga License inside the wider Indigenous data-sovereignty network and for Peter-Lucas Jones's verbatim framing "I don't just think about how AI can be used, I think about how we can be the makers of AI"

Source: entities/campaigns/camp-te-hiku-media-kaitiakitanga-license.md in movement-graph at pin 3cc1a36.