Ana Brainiall

Automatically extract names, companies, and dates from text

Beginner · 8 min · By Ana Brainiall

What NER solves that regex can't

Regex is great for rigid patterns: a ZIP code always has a fixed format, and an email address always contains an @. People's names, company names, and dates written in free text have no such fixed pattern.

NER uses a language model that learns to understand context: "the company Itaú" vs "Itaú street". Regex can't make that distinction; NER gets it right 95%+ of the time.
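To make the contrast concrete, here is a minimal sketch of the regex side. The pattern and the two sample sentences are illustrative assumptions, not part of any Brainiall API. A pattern that looks for the word "Itaú" fires identically in both sentences, because it has no notion of the surrounding context.

```python
import re

# Illustrative pattern (an assumption for this sketch): match the literal word "Itaú".
PATTERN = re.compile(r"\bItaú\b")

sentences = [
    "She works at the company Itaú.",  # here "Itaú" is an organization
    "He lives on Itaú street.",        # here "Itaú" is part of a location
]

def regex_hits(text: str) -> list[str]:
    """Return every substring the pattern matches in `text`."""
    return PATTERN.findall(text)

for s in sentences:
    # The regex reports the same match for both sentences; it cannot tell
    # an organization from a street name.
    print(s, "->", regex_hits(s))
```

Both lines print the same match, which is exactly the ambiguity a context-aware model is meant to resolve.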

[Figure: sample text with color-coded highlights: names in blue, …]

Standard and custom entities

Public NER models (spaCy, Hugging Face) detect standard entity types such as people (PER), organizations (ORG), and dates (DATE).

For specific domains, you can train a custom model.

Brainiall offers custom models on demand on the Business plan.

How it works under the hood (in 30 seconds)

1. Tokenization: text is broken into words and punctuation
2. POS tagging: each word receives a grammatical class (noun, verb...); this is an explicit step in classic pipelines, while transformer models learn it implicitly
3. Contextualization: each word is converted into a vector of 768+ dimensions considering its neighbors
4. BIO classification: each token is tagged as Begin-entity, Inside-entity, or Outside. E.g.: "Pedro" (B-PER) "Silva" (I-PER) "works" (O) "at" (O) "Petrobras" (B-ORG)
5. Aggregation: consecutive B+I tokens become a single entity
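Steps 4 and 5 can be sketched in a few lines. This is a simplified illustration, assuming the tokens and their BIO tags are already available as two parallel lists; the function name `aggregate_bio` is hypothetical, and real libraries usually map entities back to character offsets instead of joining tokens with spaces.

```python
def aggregate_bio(tokens: list[str], tags: list[str]) -> list[dict]:
    """Merge BIO-tagged tokens into entities (the aggregation step)."""
    entities = []
    current = None  # entity being built, or None when outside any entity
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current:
                entities.append(current)
            current = {"type": tag[2:], "tokens": [token]}
        elif tag.startswith("I-") and current and current["type"] == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current["tokens"].append(token)
        else:
            # "O" (or a stray I- tag) closes the open entity.
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    # Joining with spaces is a simplification for this sketch.
    return [{"text": " ".join(e["tokens"]), "type": e["type"]} for e in entities]

tokens = ["Pedro", "Silva", "works", "at", "Petrobras"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG"]
print(aggregate_bio(tokens, tags))
# [{'text': 'Pedro Silva', 'type': 'PER'}, {'text': 'Petrobras', 'type': 'ORG'}]
```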

Modern models (mBERT, XLM-R, multilingual DeBERTa) run this pipeline in ~10–50ms for a paragraph.

Practical use cases

Specific limitations for Portuguese

Tip: for borderline cases, always manually review 100 examples before going to production.
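One simple way to pick those 100 examples is a reproducible random sample, so the review set is not biased toward the first or most recent texts. The sketch below uses only the standard library; the function name `sample_for_review` and the placeholder corpus are assumptions for illustration.

```python
import random

def sample_for_review(texts: list[str], k: int = 100, seed: int = 0) -> list[str]:
    """Pick up to `k` texts at random for manual review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return rng.sample(texts, min(k, len(texts)))

# Hypothetical corpus standing in for real production texts.
corpus = [f"document {i}" for i in range(1000)]
batch = sample_for_review(corpus)
print(len(batch))  # 100
```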

Integrating via API

A single endpoint returns an array of entities:

```python
import httpx

# Send the text to the NER endpoint; replace brnl-xxx with your own API key.
r = httpx.post(
    "https://api.brainiall.com/api/nlp/ner",
    json={"text": "Pedro Silva, from Petrobras, announced on January 5th."},
    headers={"Authorization": "Bearer brnl-xxx"},
)
r.raise_for_status()  # fail loudly on HTTP errors
print(r.json())
# [{"text": "Pedro Silva", "type": "PER", "start": 0, "end": 11},
#  {"text": "Petrobras", "type": "ORG", "start": 18, "end": 27},
#  {"text": "January 5th", "type": "DATE", "start": 42, "end": 53}]
```
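Once the response is parsed, the entities are plain dictionaries. The sketch below assumes the response shape shown in the comment above (keys `text`, `type`, `start`, `end`) and further assumes that `end` is exclusive, as in Python slices. The helper names are hypothetical, and the sample entities are built locally instead of coming from a live call.

```python
from collections import defaultdict

def group_by_type(entities: list[dict]) -> dict[str, list[str]]:
    """Collect entity texts under their type label, e.g. {"PER": [...]}."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["type"]].append(ent["text"])
    return dict(grouped)

def span_matches(text: str, ent: dict) -> bool:
    """Check that the offsets point at the entity text (assumes exclusive `end`)."""
    return text[ent["start"]:ent["end"]] == ent["text"]

text = "Pedro Silva, from Petrobras, announced on January 5th."
# Sample entities; offsets are computed here, not taken from a live response.
entities = [
    {"text": t, "type": label, "start": text.index(t), "end": text.index(t) + len(t)}
    for t, label in [("Pedro Silva", "PER"), ("Petrobras", "ORG"), ("January 5th", "DATE")]
]

print(group_by_type(entities))
print(all(span_matches(text, e) for e in entities))
```

Checking spans against the input text is a cheap sanity test to run before highlighting or redacting entities by offset.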

Try it right now

Ask "extract people, companies, and dates from this text: [paste]" in the Brainiall chat. Or use the API at /api/nlp/ner. The Pro plan at $29 includes 10k requests/month; Business adds batch processing and custom models.
