Modern NLP work splits cleanly into two prompt styles: generative prompts that ask an LLM to label or extract directly, and code prompts that ask AI to write a classical scikit-learn or transformer pipeline. This topic gives you both — when to use each, and how to brief them well.
For most data science teams, NLP used to mean weeks of annotation, tokenisation, and model fine-tuning. A surprising portion of that work can now be replaced by a well-crafted prompt to a capable LLM — at least for the prototyping phase. For production-scale work, classical NLP pipelines (TF-IDF + logistic regression, spaCy, transformer encoders) remain faster, cheaper, and more predictable. The data scientist's job is to pick the right approach per task and to brief AI well in both worlds. This tutorial covers the four most common NLP tasks: sentiment analysis, text classification, entity extraction, and retrieval-augmented analytics.
NLP prompts come in two flavours. Direct LLM prompts ask the model to perform the NLP task itself — "classify this support ticket into one of these eight intents". They are zero-shot or few-shot, no training required, and excellent for prototypes or low-volume work. Code-generation prompts ask AI to write a traditional pipeline — "produce scikit-learn code that trains a TF-IDF + logistic regression sentiment classifier on this DataFrame". They are ideal when latency, cost, or interpretability matter.
An analogy: a direct LLM prompt is like hiring a freelance consultant for one job — fast, flexible, expensive per task. A code-generation prompt is like building a factory — slower to set up, cheap per item, predictable output. Most production systems eventually need the factory; most prototypes start with the consultant.
When the corpus is too large or too proprietary to fit in any prompt, the standard answer is retrieval-augmented generation (RAG): embed the documents, retrieve the relevant chunks for a question, and pass them as context. For analytics, this lets you ask natural-language questions across thousands of support tickets, contracts, or research papers. Prompt design for RAG follows the same data-brief discipline: describe the corpus, the embedding model, the chunk size, and the question shape.
Weak prompt
Do sentiment analysis on my customer reviews.
No label set ("positive/negative" or 5-star?), no domain (hotel reviews are different from software reviews), no output format (column? JSON? probabilities?). The AI will return either an over-generic Python snippet or, worse, a stream of unstructured opinions on individual reviews.
Stronger prompt
Act as a senior NLP engineer.
Task: label customer support tickets with both
sentiment and intent.
Input: DataFrame tickets_df (~18,000 rows) with columns
ticket_id (int), ticket_text (str, 20-800 words),
product_area (str), language (str, mostly 'en').
Output schema (one new column per row):
sentiment ∈ {very_negative, negative, neutral,
positive, very_positive}
intent ∈ {bug_report, feature_request, billing,
how_to, complaint, praise, other}
intent_confidence ∈ [0, 1]
short_summary (str, ≤ 18 words)
Constraints:
- For prototyping: provide a direct LLM prompt
template (system + user) that returns strict JSON.
- For production: provide scikit-learn code that
trains a TF-IDF + Logistic Regression classifier
using a labelled subset of 2,000 rows
(assume column `intent_label` exists).
- Include input validation and graceful error
handling for malformed JSON output.
The AI will produce a clean JSON-emitting system prompt for the prototype, plus a scikit-learn pipeline (Pipeline + TfidfVectorizer + LogisticRegression with class weights) for the production version, along with a JSON parser that retries on malformed output.
The pattern is: task framing → input description → output schema → constraint set → choice of approach. The output schema is the most powerful piece. If you specify the exact label set and ask for strict JSON, the LLM will emit parseable, columnar data you can drop into a DataFrame with a single json.loads per row.
For high-volume tasks, always ask AI to write both versions — a direct LLM prompt for the first 100 examples and a classical pipeline for the next 100,000. This dual-track approach gives you a working prototype on day one and a cheap, fast production system on week two.
Tip: When using direct LLM prompts in production, log the raw model output alongside the parsed labels. When the parsing breaks (and it will), you have the evidence to fix the prompt rather than the data.
Take a sample of 50 free-text customer messages. Write a direct LLM prompt that labels each one with a sentiment and an intent, returning strict JSON. Run it. Manually grade the accuracy on the first 20. Use the errors to refine the prompt.
Prompt AI to produce a scikit-learn Pipeline (TF-IDF + Logistic Regression with class_weight='balanced') for a multi-class classification problem. Specify the DataFrame columns, the target labels, and the evaluation metric (macro-F1). Compare its performance to the direct LLM approach on the same held-out set.
Design a RAG prompt for analytics: "I have 4,000 sales-call transcripts. Build a retrieval-augmented system that lets me ask questions like 'what are the top three objections raised in calls with enterprise prospects in Q3?' Specify chunking, embedding model, retrieval strategy, and answer format."
Sign in to join the discussion and post comments.
Sign inPrompt Engineering Projects & Real-World Applications
Twelve hands-on projects that turn prompt engineering theory into a portfolio. Build chatbots, content generators, RAG systems, and more.
Prompt Engineering for Content & Copywriting
Write blogs, ads, emails, and social media content ten times faster with AI. 13 practical tutorials on prompt engineering for content creators and copywriters.
Prompt Engineering for Image Generation
Turn words into stunning visuals. Master AI image generation tools like Midjourney, DALL·E 3, and Stable Diffusion with 18 focused tutorials — from first prompt to full brand identity.
Foundations of Prompt Engineering
The must-know basics of prompt engineering. Learn what prompts are, how AI models read them, and how to write clear instructions that get great results.
Prompt Engineering for Specific AI Tools
Tool-by-tool mastery — deep dives into ChatGPT, Claude, Gemini, GitHub Copilot, Midjourney, Stable Diffusion, and more. Learn the exact prompting techniques each platform rewards.
Prompt Engineering for Developers
Use AI as your coding co-pilot. 18 tutorials on writing prompts to generate clean code, debug faster, write tests, build APIs, and ship better software.