Chaotic MS Paint Sticker Emoji Set
Generate 16 chaotic chat-style sticker emojis from reference images, with intentionally ugly MS Paint doodles, mouse-handwritten meme text, and consistent imperfection.
Task Objective:
Generate a set of 16 chat-style sticker emojis based on the reference images. The result should feel expressive, highly shareable, and stylistically unified, but intentionally chaotic, low-quality, and humorous.

The final output should look like:
Someone who cannot draw, using a mouse in MS Paint, randomly doodling and writing text at the same time: messy, awkward, low-effort, but unexpectedly funny.

————————

Input Structure:
Image 1: Character reference (may include one or multiple subjects such as people, pets, or combinations)
Image 2: Layout reference (only for understanding the 4x4 grid structure, not style)

User Inputs:
- Text content (multiple lines, any language, may be fewer or more than 16)

————————

Character Usage Rules:
- Identify all possible subjects in Image 1 (people, animals, etc.)
- Each subject can act as a main character
- Different stickers can feature different characters
- Some stickers may include multiple characters interacting
- Distribution should feel natural and context-driven

————————

Consistency Definition (Critical Redefinition):
This task prioritizes “consistent imperfection”, NOT realistic consistency.

The same character across stickers:
Does NOT need to look identical, but must feel like it was drawn by the same unskilled person.

Must be consistent in:
- Drawing behavior (same clumsy hand)
- Simplification logic
- Error patterns (crooked proportions, shaky lines)
- Overall messiness level

Allowed:
- Facial distortion
- Proportion inconsistency
- Structural errors
- Missing details

Must retain:
- Minimal recognizable traits (e.g. silhouette, color, signature features)

Summary:
Consistency = “wrong in the same way”, not “accurate likeness”

————————

Layout & Structure:
- 16 stickers total, arranged in a 4x4 grid
- Each sticker is an independent frame
- Can feature single or multiple characters
- Individual frames can be messy
- Overall grid must remain clear and readable

————————

Text System (Core Expression Layer):
User provides multiple lines of text.

Quantity Handling:
- If fewer than 16 → automatically complete to 16
- If more than 16 → select the most expressive 16

Language Rule:
- Use the same language as the user input
- Do NOT enforce any specific language

Completion Rules:
- Maintain tone consistency (sarcastic, lazy, emotional, clingy, absurd, etc.)
- Prefer internet-style expressions
- Short phrases preferred, but longer lines allowed
- Avoid repetition

Expression Goals:
- Instantly understandable
- Emotionally strong
- Feels like real meme text

Tone Priority:
- Complaints
- Self-talk
- Emotional bursts
- Indifference / annoyance / absurd humor

Avoid:
- Polite responses
- Formal or structured phrasing

————————

Text-Image Integration (Critical):
Text must be DRAWN, not typeset.

Must:
- Look like mouse handwriting (crooked, shaky, uneven size)
- Be messy (tilted, overlapping, misaligned)
- Have inconsistent spacing
- Be placed freely (on face, beside, edges, etc.)

Allowed:
- Repeated letters (aaaaa)
- Stretched words (soooo tired)
- Random punctuation (????!!!)
- Messy or ugly writing

Must:
- Remain readable
- Not block understanding

————————

Expression Generation Mechanism (Most Important):
Simulate this process:
“A person who cannot draw, using a mouse, doodling randomly while writing text at the same time.”

Key Rules:
- Image and text must come from the SAME moment
- Not: draw first, then add text
- But: draw and write simultaneously

Each sticker should feel like:
- Random doodle
- Then spontaneous writing
- Or both happening together

Should feel:
- Unplanned
- Careless
- Immediate

————————

Text-Image Relationship:
Text and visuals should form:
- Commentary
- Emotional amplification
- Self-talk
- Or slightly mismatched humor

Allowed:
- Loose or imperfect alignment
- Absurd or off-topic humor

Goal:
Not accuracy, but humor

————————

Aesthetic DNA (Core Style Driver):
Style origin:
Terrible MS Paint doodles + failed imitation + extremely low drawing skill

Visual Traits:

Lines:
- Shaky, unstable, jagged
- Mouse-drawn look

Forms:
- Bad proportions
- Stick figures or crude shapes
- Distorted structures

Details:
- Minimal or none
- “Cannot draw” feeling

Texture:
- Pixelated
- Rough edges

Composition:
- Frames can be messy
- Grid must stay readable

Emotion:
- Awkward, direct, absurd, funny

Resemblance:
- Only vaguely resembles the original
- Like a failed copy

————————

Style Enforcement (Critical):
When conflict occurs:
Realism vs Style → ALWAYS choose Style

Allowed to break:
- Detail
- Proportion
- Accuracy
- Cleanliness

Strictly forbid:
- Clean lines
- Correct anatomy
- Polished visuals
- Designed aesthetics

Rule:
If it looks “good”, it is WRONG. Force it back to messy, ugly, low-effort.

————————

Anti-Template & Randomization System (Critical):
Strictly forbid:
- Numbering (1, 2, 3…)
- List-style output
- Sequential planning

Must treat all 16 stickers as:
“16 independent, random expressions”

Randomness Requirements:
- Vary text length
- Vary tone and emotion
- Some complete, some fragmented
- Some minimal or almost empty

Avoid:
- Repetition
- Predictable phrasing
- Common default responses

Allow:
- Abrupt or weird expressions
- Uneven density
- Inconsistent structure

Generation Method:
Do NOT plan all 16. Instead simulate:
“16 separate spontaneous moments”

Must include:
- Emotional fluctuation
- Instability
- Randomness

Anti-reuse rule:
Each generation must:
- Avoid repeating previous outputs
- Avoid fixed patterns
- Feel freshly created

————————

Sticker Requirements:
- Each sticker visually distinct
- Clear emotional signal
- Usable in chat
- Strong expressive power

————————

Final Goal:
Generate 16 stickers.

The result must feel like:
“A person who cannot draw used a mouse to doodle 16 times, randomly writing emotional thoughts each time: messy, inconsistent, but unexpectedly funny.”

NOT:
“A clean, well-designed AI sticker set”