Python Developer – Transition Repetition Analysis Module

120
ETH, DAI, USDT
+55
0 days (till May 25th, 2025)

Milestone 1 – Linguistic QA Validator (French Transition Rules)

🎯 Objective

Develop a Python module to validate batches of AI-generated French transition phrases. This module ensures:
1. No stylistically significant word repetition across transitions in a group
2. “Enfin” is used only in the final transition of each group
3. Grammatical stopwords (like "le", "de", "à", "et") are excluded from repetition checks

📁 Module Target

File: utils/validate_prompt_compliance.py

📚 Definitions

✅ "Repetition" Violation

Flag repeated meaningful words in a group of transitions.
Use a French stopword list to ignore non-stylistic words such as:
["le", "la", "les", "de", "des", "un", "une", "à", "et", "en", "du", "que", "si", "ce", "sur"]

🛑 "Enfin" Misuse

Flag if “enfin” appears in any position other than the last transition in a group.

🧩 Required Functions

tokenize(text: str) -> List[str]: Normalize case, remove punctuation, return word tokens
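
A minimal sketch of how tokenize() could behave, assuming that only letters form tokens, so apostrophes and hyphens act as separators ("l'info" becomes "l" and "info"). The NFC normalization step and the `_WORD_RE` name are assumptions added for illustration; the posting does not specify them.

```python
import re
import unicodedata
from typing import List

# [^\W\d_] matches any Unicode letter (a word character that is neither a
# digit nor an underscore), so accented letters such as "é" or "à" stay
# inside their tokens.
_WORD_RE = re.compile(r"[^\W\d_]+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text, drop punctuation and return the word tokens."""
    # Assumption: normalize to NFC so that a letter typed as "e" plus a
    # combining accent is treated the same as the precomposed "é".
    normalized = unicodedata.normalize("NFC", text)
    return _WORD_RE.findall(normalized.lower())


if __name__ == "__main__":
    print(tokenize("Par ailleurs,"))       # ['par', 'ailleurs']
    print(tokenize("À noter également,"))  # ['à', 'noter', 'également']
```

Splitting on apostrophes keeps elided articles ("l", "d") as separate tokens. They would then have to be added to the stopword list if they should not count as repetitions.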

check_transition_group(transitions: List[str]) -> Dict:
Example return:
{
  "repetition": ["par", "direction"],
  "enfin_misplaced": True
}
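
One possible shape for check_transition_group(), offered as a sketch rather than the expected implementation. It assumes that a word counts as repeated only when it occurs in at least two different transitions of the group, and that the result lists repeated words in order of first appearance. The helper `_words` and the constant `STOPWORDS` are hypothetical names.

```python
import re
from collections import Counter
from typing import Dict, List

# Stopword set based on the posting's list. "par" is deliberately not
# included, because the posting's examples flag it as a repetition.
STOPWORDS = frozenset({
    "le", "la", "les", "de", "des", "un", "une", "à",
    "et", "en", "du", "que", "si", "ce", "sur",
})


def _words(text: str) -> List[str]:
    """Hypothetical helper: lowercase letter-only tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def check_transition_group(transitions: List[str]) -> Dict:
    counts: Counter = Counter()
    order: List[str] = []  # meaningful words in order of first appearance
    for transition in transitions:
        # dict.fromkeys de-duplicates within one transition, so a word is
        # counted at most once per transition.
        for word in dict.fromkeys(_words(transition)):
            if word in STOPWORDS:
                continue
            if word not in counts:
                order.append(word)
            counts[word] += 1
    repetition = [w for w in order if counts[w] > 1]
    # "enfin" is acceptable only in the last transition of the group.
    enfin_misplaced = any("enfin" in _words(t) for t in transitions[:-1])
    return {"repetition": repetition, "enfin_misplaced": enfin_misplaced}


if __name__ == "__main__":
    print(check_transition_group(["Par ailleurs,", "Par contre,", "Par exemple,"]))
    # {'repetition': ['par'], 'enfin_misplaced': False}
```

With this sketch both keys are always present in the result. Dropping empty entries, as the per-output "violations" objects in the output example do, is left to the caller.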

validate_batch(batch_outputs: List[List[str]]) -> Dict: Returns summary of violations and per-output breakdown
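
The aggregation step could look roughly like the sketch below. To keep the block self-contained, the per-group checker is passed in as `check`, a parameter that is not part of the signature in this posting; in the actual module validate_batch() would presumably call check_transition_group() directly. Output IDs are assumed to be 1-based positions in the batch.

```python
from typing import Callable, Dict, List


def validate_batch(
    batch_outputs: List[List[str]],
    check: Callable[[List[str]], Dict],  # assumption: injected checker
) -> Dict:
    repetition_ids: List[int] = []
    enfin_ids: List[int] = []
    violated_words: List[str] = []
    details: List[Dict] = []

    for output_id, transitions in enumerate(batch_outputs, start=1):
        result = check(transitions)
        violations: Dict = {}  # only rules that actually fired are recorded

        words = result.get("repetition", [])
        if words:
            violations["repetition"] = list(words)
            repetition_ids.append(output_id)
            for word in words:
                if word not in violated_words:  # keep first-seen order
                    violated_words.append(word)

        if result.get("enfin_misplaced"):
            violations["enfin_misplaced"] = True
            enfin_ids.append(output_id)

        details.append({
            "output_id": output_id,
            "transitions": list(transitions),
            "violations": violations,
        })

    return {
        "total_outputs": len(batch_outputs),
        "outputs_with_violations": sum(1 for d in details if d["violations"]),
        "violations_summary": {
            "repetition": {
                "count": len(repetition_ids),
                "affected_outputs": repetition_ids,
                "violated_words": violated_words,
            },
            "enfin_misplaced": {
                "count": len(enfin_ids),
                "affected_outputs": enfin_ids,
            },
        },
        "details": details,
    }
```

The returned dictionary contains only lists, strings, integers and booleans, so it can be serialized with json.dumps(report, ensure_ascii=False, indent=2); ensure_ascii=False keeps accented characters readable in the JSON text.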

📤 Output Format (Example)

{
  "total_outputs": 5,
  "outputs_with_violations": 4,
  "violations_summary": {
    "repetition": {
      "count": 3,
      "affected_outputs": [1, 2, 4],
      "violated_words": ["par", "prenons", "direction", "dans"]
    },
    "enfin_misplaced": {
      "count": 1,
      "affected_outputs": [3]
    }
  },
  "details": [
    {
      "output_id": 1,
      "transitions": ["Par ailleurs,", "Par contre,", "Par exemple,"],
      "violations": {"repetition": ["par"]}
    },
    {
      "output_id": 2,
      "transitions": ["Prenons la direction de Paris,", "Ensuite, prenons la direction de Lyon,", "Enfin, une note sur Marseille"],
      "violations": {"repetition": ["prenons", "direction"]}
    },
    {
      "output_id": 3,
      "transitions": ["Enfin, une annonce importante", "Puis une autre nouvelle", "Pour conclure,"],
      "violations": {"enfin_misplaced": true}
    },
    {
      "output_id": 4,
      "transitions": ["Dans un autre registre,", "Dans la même région,", "Encore dans le domaine économique,"],
      "violations": {"repetition": ["dans"]}
    },
    {
      "output_id": 5,
      "transitions": ["À noter également,", "Nous terminons avec cette info :", "Pour finir,"],
      "violations": {}
    }
  ]
}

✅ Completion Criteria

- tokenize() correctly splits and lowercases all transition text
- Repetition logic excludes stopwords
- enfin_misplaced is set only when “enfin” appears in a transition other than the last one of its group
- All outputs match the JSON schema above
- Module is testable and cleanly structured

🧠 Skills Required

- Python 3
- Regex and tokenization
- Set logic and dictionaries
- JSON formatting
- NLP or editorial QA experience (preferred)

