Python Developer – Transition Repetition Analysis Module

120
ETH, DAI, USDT
+55
0 days (till May 25th, 2025)

Milestone 1 – Linguistic QA Validator (French Transition Rules)

🎯 Objective

Develop a Python module to validate batches of AI-generated French transition phrases. This module ensures:
1. No stylistically significant word repetition across transitions in a group
2. “Enfin” is used only in the final transition of each group
3. Grammatical stopwords (like "le", "de", "à", "et") are excluded from repetition checks

📁 Module Target

File: utils/validate_prompt_compliance.py

📚 Definitions

✅ "Repetition" Violation

Flag any meaningful word that appears in more than one transition of the same group.
Use a French stopword list so that grammatical, non-stylistic words are ignored, such as:
["le", "la", "les", "de", "des", "un", "une", "à", "et", "en", "du", "que", "si", "ce", "sur"]
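
The repetition rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names FRENCH_STOPWORDS and find_repeated_words are hypothetical, and it assumes a word counts as repeated when it occurs in at least two different transitions of the group. Note that "par" is deliberately not in the set here, since the example outputs in this brief report it as a repeated word.

```python
import re
from typing import Dict, List

# Assumed stopword set (hypothetical name); "par" is left out on purpose
# because the example outputs in the brief flag it as a repetition.
FRENCH_STOPWORDS = frozenset({
    "le", "la", "les", "de", "des", "un", "une", "à",
    "et", "en", "du", "que", "si", "ce", "sur",
})


def find_repeated_words(transitions: List[str]) -> List[str]:
    """Return non-stopwords found in more than one transition, in first-seen order."""
    seen_in: Dict[str, int] = {}  # word -> number of transitions containing it
    for transition in transitions:
        tokens = re.findall(r"[^\W\d_]+", transition.lower())
        # dict.fromkeys de-duplicates within one transition while keeping order,
        # so a word used twice in the same transition is counted only once.
        for word in dict.fromkeys(tokens):
            if word not in FRENCH_STOPWORDS:
                seen_in[word] = seen_in.get(word, 0) + 1
    return [word for word, count in seen_in.items() if count > 1]
```

Counting per transition rather than per token keeps the check aligned with "repetition across transitions": a word doubled inside a single phrase is not reported by this sketch.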

🛑 "Enfin" Misuse

Flag if “enfin” appears in any position other than the last transition in a group.
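
One possible way to express this rule in code is shown below. The helper name is_enfin_misplaced is hypothetical, and the sketch assumes that "enfin" is matched as a whole word, case-insensitively, anywhere inside a transition.

```python
import re
from typing import List


def is_enfin_misplaced(transitions: List[str]) -> bool:
    """Return True if "enfin" occurs in any transition except the last one."""
    # transitions[:-1] is empty for groups of zero or one transition,
    # so such groups can never trigger the violation.
    return any(
        "enfin" in re.findall(r"[^\W\d_]+", transition.lower())
        for transition in transitions[:-1]
    )
```

Matching on tokens rather than substrings avoids false positives on longer words that merely contain the letters "enfin".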

🧩 Required Functions

tokenize(text: str) -> List[str]: Normalize case, remove punctuation, return word tokens
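
A minimal sketch of tokenize, under two stated assumptions: a token is any run of Unicode letters (so apostrophes, hyphens, digits, and punctuation act as separators), and the input is normalized to NFC first so that accented letters arriving in decomposed form are not split.

```python
import re
import unicodedata
from typing import List

# Assumption: a word is a run of letters, accented French letters included.
_WORD_RE = re.compile(r"[^\W\d_]+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text, drop punctuation, and return the word tokens."""
    # NFC merges a base letter and its combining accent into one character.
    normalized = unicodedata.normalize("NFC", text)
    return _WORD_RE.findall(normalized.lower())
```

Whether elided forms such as "l'autre" should yield two tokens or one is not specified in this brief; the sketch splits them.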

check_transition_group(transitions: List[str]) -> Dict:
Example return:
{
  "repetition": ["par", "direction"],
  "enfin_misplaced": True
}
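
A sketch of how check_transition_group might combine both rules. It assumes that a key is included only when the corresponding violation occurs, which is how the "violations" objects in the example output are shaped; the FRENCH_STOPWORDS name and the exact tokenization are illustrative assumptions, and "par" is left out of the stopword set because the examples flag it.

```python
import re
from typing import Dict, List

# Assumed stopword set (hypothetical name); "par" is omitted because
# the examples in the brief report it as a repeated word.
FRENCH_STOPWORDS = frozenset({
    "le", "la", "les", "de", "des", "un", "une", "à",
    "et", "en", "du", "que", "si", "ce", "sur",
})


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return runs of letters as tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def check_transition_group(transitions: List[str]) -> Dict:
    """Return the violations found in one group; an empty dict means none."""
    token_lists = [tokenize(t) for t in transitions]

    # Count, per non-stopword, how many transitions contain it.
    counts: Dict[str, int] = {}
    for tokens in token_lists:
        for word in dict.fromkeys(tokens):  # unique per transition, order kept
            if word not in FRENCH_STOPWORDS:
                counts[word] = counts.get(word, 0) + 1
    repeated = [word for word, n in counts.items() if n > 1]

    # "enfin" is only allowed in the last transition of the group.
    misplaced = any("enfin" in tokens for tokens in token_lists[:-1])

    violations: Dict = {}
    if repeated:
        violations["repetition"] = repeated
    if misplaced:
        violations["enfin_misplaced"] = True
    return violations
```

Returning an empty dict for a clean group lets callers test for violations with a plain truthiness check.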

validate_batch(batch_outputs: List[List[str]]) -> Dict: Returns a summary of violations across the batch and a per-output breakdown

📤 Output Format (Example)

{
  "total_outputs": 5,
  "outputs_with_violations": 4,
  "violations_summary": {
    "repetition": {
      "count": 3,
      "affected_outputs": [1, 2, 4],
      "violated_words": ["par", "prenons", "direction", "dans"]
    },
    "enfin_misplaced": {
      "count": 1,
      "affected_outputs": [3]
    }
  },
  "details": [
    {
      "output_id": 1,
      "transitions": ["Par ailleurs,", "Par contre,", "Par exemple,"],
      "violations": {"repetition": ["par"]}
    },
    {
      "output_id": 2,
      "transitions": ["Prenons la direction de Paris,", "Ensuite, prenons la direction de Lyon,", "Enfin, une note sur Marseille"],
      "violations": {"repetition": ["prenons", "direction"]}
    },
    {
      "output_id": 3,
      "transitions": ["Enfin, une annonce importante", "Puis une autre nouvelle", "Pour conclure,"],
      "violations": {"enfin_misplaced": true}
    },
    {
      "output_id": 4,
      "transitions": ["Dans un autre registre,", "Dans la même région,", "Encore dans le domaine économique,"],
      "violations": {"repetition": ["dans"]}
    },
    {
      "output_id": 5,
      "transitions": ["À noter également,", "Nous terminons avec cette info :", "Pour finir,"],
      "violations": {}
    }
  ]
}
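
A hedged sketch of validate_batch that builds a dictionary with the shape shown above. Several points are assumptions rather than requirements stated in this brief: output_id is taken to be the 1-based position in the batch, violated_words is taken to be the de-duplicated union of repeated words in first-seen order, and the compact check_transition_group included here (with "par" left out of the stopword set) is only one possible implementation.

```python
import re
from typing import Dict, List

# Assumed stopword set; "par" is omitted because the examples flag it.
STOPWORDS = frozenset({
    "le", "la", "les", "de", "des", "un", "une", "à",
    "et", "en", "du", "que", "si", "ce", "sur",
})


def tokenize(text: str) -> List[str]:
    return re.findall(r"[^\W\d_]+", text.lower())


def check_transition_group(transitions: List[str]) -> Dict:
    token_lists = [tokenize(t) for t in transitions]
    counts: Dict[str, int] = {}
    for tokens in token_lists:
        for word in dict.fromkeys(tokens):
            if word not in STOPWORDS:
                counts[word] = counts.get(word, 0) + 1
    violations: Dict = {}
    repeated = [w for w, n in counts.items() if n > 1]
    if repeated:
        violations["repetition"] = repeated
    if any("enfin" in tokens for tokens in token_lists[:-1]):
        violations["enfin_misplaced"] = True
    return violations


def validate_batch(batch_outputs: List[List[str]]) -> Dict:
    """Summarize violations over a batch and keep a per-output breakdown."""
    repetition_ids: List[int] = []
    enfin_ids: List[int] = []
    violated_words: List[str] = []
    details: List[Dict] = []

    # Assumption: output_id is the 1-based position of the group in the batch.
    for output_id, transitions in enumerate(batch_outputs, start=1):
        violations = check_transition_group(transitions)
        if "repetition" in violations:
            repetition_ids.append(output_id)
            for word in violations["repetition"]:
                if word not in violated_words:
                    violated_words.append(word)
        if violations.get("enfin_misplaced"):
            enfin_ids.append(output_id)
        details.append({
            "output_id": output_id,
            "transitions": transitions,
            "violations": violations,
        })

    return {
        "total_outputs": len(batch_outputs),
        "outputs_with_violations": sum(1 for d in details if d["violations"]),
        "violations_summary": {
            "repetition": {
                "count": len(repetition_ids),
                "affected_outputs": repetition_ids,
                "violated_words": violated_words,
            },
            "enfin_misplaced": {
                "count": len(enfin_ids),
                "affected_outputs": enfin_ids,
            },
        },
        "details": details,
    }
```

The returned dictionary can be serialized with json.dumps(result, ensure_ascii=False, indent=2); Python's True is then written as JSON true, and accented characters stay readable.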

✅ Completion Criteria

- tokenize() correctly splits and lowercases all transition text
- Repetition logic excludes stopwords
- enfin_misplaced triggers only when “enfin” appears in a transition other than the last one of its group
- All outputs match the JSON schema above
- Module is testable and cleanly structured

🧠 Skills Required

- Python 3
- Regex and tokenization
- Set logic and dictionaries
- JSON formatting
- NLP or editorial QA experience (preferred)
