Python Developer – Transition Repetition Analysis Module

120
ETH, DAI, USDT
+55
0 days (till May 25th, 2025)

Milestone 1 – Linguistic QA Validator (French Transition Rules)

🎯 Objective

Develop a Python module to validate batches of AI-generated French transition phrases. This module ensures:
1. No stylistically significant word repetition across transitions in a group
2. “Enfin” is used only in the final transition of each group
3. Grammatical stopwords (like "le", "de", "à", "et") are excluded from repetition checks

📁 Module Target

File: utils/validate_prompt_compliance.py

📚 Definitions

✅ "Repetition" Violation

Flag repeated meaningful words in a group of transitions.
Use a French stopword list to ignore non-stylistic words such as:
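["le", "la", "les", "de", "des", "un", "une", "à", "et", "en", "du", "que", "si", "ce", "sur"]
Note: "par" is deliberately left out of this list, because the examples in this brief flag repeated "Par ..." openers as a repetition violation.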
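The repetition rule could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `find_repeated_words` is hypothetical, the stopword set is a small subset of the list above, and "repeated" is taken to mean "appears in more than one transition of the group" (repeats inside a single transition are not counted).

```python
from collections import Counter
from typing import Iterable, List, Set

# Illustrative subset of the stopword list above (assumption: the real list is longer).
STOPWORDS: Set[str] = {"le", "la", "les", "de", "des", "un", "une", "à", "et"}


def find_repeated_words(
    token_lists: Iterable[List[str]], stopwords: Set[str] = STOPWORDS
) -> List[str]:
    """Return non-stopword tokens that occur in more than one transition."""
    counts: Counter = Counter()
    first_seen: List[str] = []
    for tokens in token_lists:
        # dict.fromkeys de-duplicates while keeping order, so each word
        # is counted at most once per transition.
        for word in dict.fromkeys(tokens):
            if word in stopwords:
                continue
            if word not in counts:
                first_seen.append(word)
            counts[word] += 1
    # Report words in order of first appearance.
    return [word for word in first_seen if counts[word] > 1]


group = [
    ["prenons", "la", "direction", "de", "paris"],
    ["ensuite", "prenons", "la", "direction", "de", "lyon"],
    ["enfin", "une", "note", "sur", "marseille"],
]
print(find_repeated_words(group))  # ['prenons', 'direction']
```

Counting each word once per transition keeps the check focused on reuse across the group, which is what the objective describes.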

🛑 "Enfin" Misuse

Flag if “enfin” appears in any position other than the last transition in a group.
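A minimal sketch of this rule, assuming a hypothetical helper named `is_enfin_misplaced` and a case-insensitive whole-word match. How empty groups should be handled is not stated in the brief; here they simply produce no violation.

```python
import re
from typing import List

# Whole-word, case-insensitive match for "enfin".
_ENFIN_RE = re.compile(r"\benfin\b", re.IGNORECASE)


def is_enfin_misplaced(transitions: List[str]) -> bool:
    """True if "enfin" occurs in any transition other than the last one."""
    # transitions[:-1] is every transition except the final one.
    return any(_ENFIN_RE.search(text) for text in transitions[:-1])


print(is_enfin_misplaced(
    ["Enfin, une annonce importante", "Puis une autre nouvelle", "Pour conclure,"]
))  # True
print(is_enfin_misplaced(
    ["Prenons la direction de Paris,", "Enfin, une note sur Marseille"]
))  # False
```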

🧩 Required Functions

tokenize(text: str) -> List[str]: Normalize case, remove punctuation, return word tokens
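One possible shape for `tokenize`, offered as a sketch rather than the required implementation. It assumes that apostrophes and hyphens act as separators (so "l'économie" yields two tokens) and that input should be NFC-normalized so accented letters stay intact; neither point is specified in the brief.

```python
import re
import unicodedata
from typing import List

# Runs of word characters, excluding digits and underscores.
# Punctuation, apostrophes and hyphens therefore act as separators.
_WORD_RE = re.compile(r"[^\W\d_]+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its word tokens without punctuation."""
    # NFC keeps accented letters such as "é" as single code points.
    normalized = unicodedata.normalize("NFC", text).lower()
    return _WORD_RE.findall(normalized)


print(tokenize("À noter également,"))  # ['à', 'noter', 'également']
print(tokenize("Par ailleurs,"))       # ['par', 'ailleurs']
```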

check_transition_group(transitions: List[str]) -> Dict:
Example return:
{
  "repetition": ["par", "direction"],
  "enfin_misplaced": True
}
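The following sketch shows how `check_transition_group` might produce that return shape. The inline `tokenize` and the shortened `STOPWORDS` set are stand-ins so the snippet runs on its own, and treating a word as repeated only when it appears in two or more different transitions is an assumption.

```python
import re
from typing import Dict, List, Set

# Illustrative subset of the stopword list (assumption, not the full list).
STOPWORDS: Set[str] = {"le", "la", "les", "de", "des", "un", "une", "à", "et", "en", "du"}


def tokenize(text: str) -> List[str]:
    # Simplified stand-in: lowercase, keep runs of letters only.
    return re.findall(r"[^\W\d_]+", text.lower())


def check_transition_group(transitions: List[str]) -> Dict:
    tokenized = [tokenize(text) for text in transitions]

    # Number of transitions in which each non-stopword occurs.
    seen_in: Dict[str, int] = {}
    for tokens in tokenized:
        for word in set(tokens) - STOPWORDS:
            seen_in[word] = seen_in.get(word, 0) + 1

    # Collect repeated words in order of first appearance.
    repeated: List[str] = []
    for tokens in tokenized:
        for word in tokens:
            if seen_in.get(word, 0) > 1 and word not in repeated:
                repeated.append(word)

    # "enfin" is only allowed in the final transition.
    enfin_misplaced = any("enfin" in tokens for tokens in tokenized[:-1])
    return {"repetition": repeated, "enfin_misplaced": enfin_misplaced}


print(check_transition_group(
    ["Dans un autre registre,", "Dans la même région,", "Encore dans le domaine économique,"]
))  # {'repetition': ['dans'], 'enfin_misplaced': False}
```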

validate_batch(batch_outputs: List[List[str]]) -> Dict: Returns summary of violations and per-output breakdown
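A rough sketch of the aggregation step, following the structure of the Output Format example in this brief. The extra `check` parameter is a hypothetical addition that lets the group checker be injected (for instance a stub in unit tests); the brief itself only specifies the single-argument signature. Output IDs are assumed to be 1-based.

```python
from typing import Callable, Dict, List


def validate_batch(
    batch_outputs: List[List[str]],
    check: Callable[[List[str]], Dict],  # hypothetical: injected group checker
) -> Dict:
    repetition_ids: List[int] = []
    enfin_ids: List[int] = []
    violated_words: List[str] = []
    details: List[Dict] = []

    for output_id, transitions in enumerate(batch_outputs, start=1):
        result = check(transitions)
        violations: Dict = {}
        # Only record a key when that violation actually occurred.
        if result.get("repetition"):
            violations["repetition"] = list(result["repetition"])
            repetition_ids.append(output_id)
            for word in result["repetition"]:
                if word not in violated_words:
                    violated_words.append(word)
        if result.get("enfin_misplaced"):
            violations["enfin_misplaced"] = True
            enfin_ids.append(output_id)
        details.append(
            {"output_id": output_id, "transitions": transitions, "violations": violations}
        )

    return {
        "total_outputs": len(batch_outputs),
        "outputs_with_violations": sum(1 for d in details if d["violations"]),
        "violations_summary": {
            "repetition": {
                "count": len(repetition_ids),
                "affected_outputs": repetition_ids,
                "violated_words": violated_words,
            },
            "enfin_misplaced": {"count": len(enfin_ids), "affected_outputs": enfin_ids},
        },
        "details": details,
    }


def fake_check(transitions: List[str]) -> Dict:
    # Stand-in checker for the demo: never reports repetition.
    misplaced = any("enfin" in t.lower() for t in transitions[:-1])
    return {"repetition": [], "enfin_misplaced": misplaced}


report = validate_batch([["Enfin, A", "Puis B"], ["Pour commencer,", "Enfin,"]], fake_check)
print(report["violations_summary"]["enfin_misplaced"])  # {'count': 1, 'affected_outputs': [1]}
```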

📤 Output Format (Example)

{
  "total_outputs": 5,
  "outputs_with_violations": 4,
  "violations_summary": {
    "repetition": {
      "count": 3,
      "affected_outputs": [1, 2, 4],
      "violated_words": ["par", "prenons", "direction", "dans"]
    },
    "enfin_misplaced": {
      "count": 1,
      "affected_outputs": [3]
    }
  },
  "details": [
    {
      "output_id": 1,
      "transitions": ["Par ailleurs,", "Par contre,", "Par exemple,"],
      "violations": {"repetition": ["par"]}
    },
    {
      "output_id": 2,
      "transitions": ["Prenons la direction de Paris,", "Ensuite, prenons la direction de Lyon,", "Enfin, une note sur Marseille"],
      "violations": {"repetition": ["prenons", "direction"]}
    },
    {
      "output_id": 3,
      "transitions": ["Enfin, une annonce importante", "Puis une autre nouvelle", "Pour conclure,"],
      "violations": {"enfin_misplaced": true}
    },
    {
      "output_id": 4,
      "transitions": ["Dans un autre registre,", "Dans la même région,", "Encore dans le domaine économique,"],
      "violations": {"repetition": ["dans"]}
    },
    {
      "output_id": 5,
      "transitions": ["À noter également,", "Nous terminons avec cette info :", "Pour finir,"],
      "violations": {}
    }
  ]
}
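When the report is serialized, Python's `True` becomes JSON `true`, and passing `ensure_ascii=False` to `json.dumps` keeps French accents readable instead of escaping them. A small sketch, using a trimmed-down report built from two of the example outputs above:

```python
import json

# Trimmed-down report for illustration; only two detail entries are shown.
report = {
    "total_outputs": 2,
    "details": [
        {
            "output_id": 1,
            "transitions": ["Enfin, une annonce importante", "Puis une autre nouvelle", "Pour conclure,"],
            "violations": {"enfin_misplaced": True},
        },
        {
            "output_id": 2,
            "transitions": ["Dans un autre registre,", "Dans la même région,"],
            "violations": {"repetition": ["dans"]},
        },
    ],
}

# ensure_ascii=False writes "même" as-is rather than "m\u00eame".
text = json.dumps(report, ensure_ascii=False, indent=2)
print(text)
```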

✅ Completion Criteria

- tokenize() correctly splits and lowercases all transition text
- Repetition logic excludes stopwords
- enfin_misplaced triggers only when “enfin” appears in a transition other than the last one in its group
- All outputs match the JSON schema above
- Module is testable and cleanly structured

🧠 Skills Required

- Python 3
- Regex and tokenization
- Set logic and dictionaries
- JSON formatting
- NLP or editorial QA experience (preferred)
