Hedra lets you give AI characters a voice, and this new model is a huge step up

12:25, 17 August 2024, Technology

Hedra, the AI character generation platform, has launched a new version of its model with even more realistic head movements and facial expressions.

There is also a new 'Stylise' feature that lets you put your character in different outfits, add backdrops and even turn its face into a Lego or anime character.

Character is a new family of foundation models from Hedra designed to create more consistent and realistic humans in AI video, giving creators more control over the final output. Initially, the focus is on lip-syncing and head animation. Version 1.5 builds on that capability.

Hedra offers a generous free plan for people who want to try it, and I found that it works with any image of a human, including photos, paintings and even action figures. It animated the head and face fairly accurately, in time with the words in the audio track.

Putting Hedra Character 1.5 to the test

I created five characters, first generating an image from a prompt with Flux.1 running locally on my laptop, then using ElevenLabs to create a voice for each character. Finally, I used the voice and the image with Hedra to create a video.

Each image shows a person looking straight at the 'camera', and each one is different enough to be a useful test. I then ran Stylise on two of them, pushing it to produce a completely different look as well as a subtle change of outfit and background. Below each description, you'll see the full prompt I gave Flux.1 to create that character.
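The five prompts share a common opening ("A head-and-shoulders shot of ..., speaking directly to the camera."), so one way to keep a test set like this consistent is to assemble each prompt from a few fields. The following is only a minimal sketch of that idea: the `Character` class, its field names and `build_prompt` are hypothetical helpers introduced for illustration, not part of Flux.1, ElevenLabs or Hedra, and nothing here suggests the prompts were actually produced this way.

```python
from dataclasses import dataclass


@dataclass
class Character:
    """Hypothetical container for the parts of a portrait prompt."""
    name: str         # e.g. "Dr. Amelia Chen"
    age: int          # e.g. 45
    description: str  # e.g. "Asian-American female doctor"
    details: str      # free text: clothing, expression, background, lighting


def build_prompt(character: Character) -> str:
    """Assemble a text prompt that mirrors the shared opening of the prompts
    in this article. Assumption: the article "a" before the age is always
    correct, which holds for the ages used here but not for, say, 80.
    """
    opening = (
        f"A head-and-shoulders shot of {character.name}, "
        f"a {character.age}-year-old {character.description}, "
        "speaking directly to the camera."
    )
    return f"{opening} {character.details}"


doctor = Character(
    name="Dr. Amelia Chen",
    age=45,
    description="Asian-American female doctor",
    details="She's wearing a white lab coat over light blue scrubs.",
)
print(build_prompt(doctor))
```

The resulting string would then be pasted into whichever image generator you use; the audio and animation steps stay manual, as in the tests below.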

1. The Doctor

First, we have a middle-aged doctor wearing a lab coat. I then stylised her with a bright outfit covered in paint splatters, set against a colorful background.

It didn't do a bad job. There are a few artifacts around the mouth and the head movement is slightly exaggerated, but it reflects the tone of the audio well. The voice matching wasn't done by AI; I found a suitable voice in ElevenLabs myself.

A head-and-shoulders shot of Dr. Amelia Chen, a 45-year-old Asian-American female doctor, speaking directly to the camera. She's wearing a white lab coat over light blue scrubs, with a stethoscope draped around her neck. Her long black hair is neatly tied back, revealing a few strands of grey at her temples. Dr. Chen's expression is warm and reassuring, with subtle laugh lines around her eyes as she speaks. The background is slightly out of focus, showing a clean, well-lit hospital corridor. The lighting is soft and professional, emphasizing her facial features and the sincerity in her brown eyes as she addresses the viewer, likely explaining a medical concept or providing patient advice.

2. The Builder

Next, we have a builder character. This isn't a test of the AI image generator but of Hedra's lip-sync and head-motion animation capabilities.

The head movement is more natural here, with minimal artifacts, but the blinking is unnatural. It's still a notable improvement over Hedra Character-1 and over several other AI lip-sync tools.

A head-and-shoulders shot of Marcus Johnson, a 38-year-old African-American male construction worker, speaking directly to the camera. He's wearing a yellow hard hat and an orange high-visibility vest over a grey t-shirt. Marcus has a strong jawline with a 5 o'clock shadow, and a small scar above his right eyebrow. His expression is confident and friendly as he talks, likely explaining a aspect of his work. Sweat beads on his forehead, and there's a smudge of dirt on his cheek, suggesting he's been actively working. The background is blurred but shows the vibrant blues and oranges of a construction site. Natural sunlight illuminates his face, casting small shadows that accentuate his features.

3. The Barista

The barista character had the most natural blinking of all the tests I ran. Again, the lip movement was slightly exaggerated, but overall it was a good render. It managed to preserve the friendly demeanor of both the initial image and the voice.

A head-and-shoulders shot of Sofia Rodriguez, a 25-year-old Hispanic female barista, speaking directly to the camera. She's wearing a dark green apron over a white button-up shirt, with the top button undone. Her curly brown hair is tied back in a messy bun, with a few stray curls framing her face. Sofia's warm brown eyes are engaged and friendly as she talks, likely describing a coffee blend or brewing technique. She has small, simple silver stud earrings, and a glimpse of a delicate tattoo is visible on her right wrist. The background is a softly blurred coffee shop interior, with warm, amber lighting that highlights the left side of her face, creating a cozy atmosphere.

4. The Teacher

I'm not too happy with how this image turned out. It looks more like a bad ID photo than a video of someone talking to the camera, but the realism in the image seems to have helped Hedra.

A head-and-shoulders shot of Mr. David Okafor, a 52-year-old Black British male high school teacher, speaking directly to the camera. He's wearing a navy blue blazer over a light blue shirt with a striped tie. His salt-and-pepper hair is cut short, and he wears rectangular glasses that reflect a bit of light. Mr. Okafor's expression is patient and engaging as he speaks, likely explaining a historical concept. Laugh lines and a few age spots are visible on his face, giving him a distinguished appearance. The background is a blurred classroom, with the edge of a whiteboard visible. The lighting is a mix of soft overhead lights and natural light from a nearby window, creating a warm, educational atmosphere.

5. The Farmer

Finally, an older face. I prompted the audio to include a pause in the middle, and Hedra accurately reflected it by animating the character taking a breath and gathering her thoughts.

As in all the tests, the mouth and head movements were exaggerated compared to reality, but this is still a big improvement.

A head-and-shoulders shot of Emma Larsson, a 60-year-old Scandinavian female farmer, speaking directly to the camera. She's wearing a plaid flannel shirt and a wide-brimmed sun hat that casts a slight shadow over her eyes. Emma's face is weathered from years of outdoor work, with deep laugh lines and sun spots. Her grey hair peeks out from under her hat in a practical braid. Her blue eyes are bright and passionate as she talks, likely discussing crop conditions or sustainable farming practices. The background is a blurred wheat field bathed in golden early morning light. A bead of sweat is visible on her temple, and her skin has a healthy, sun-kissed glow.

Final thoughts

I was already a fan of the animated lip movement in v1, but with 1.5 Hedra is stepping things up, adding more natural head movement and facial expressions.

Stylise is also a powerful addition, and it gives us some insight into what we might see from the eventual, fully controllable AI video model that Hedra is developing.

It lets you easily adjust any element of your character image, changing the character's whole look or even just the clothes it is wearing.

The only thing it needs now is widescreen and portrait options, so that it can be better integrated into projects made with the growing number of AI video products, including Runway and Kling.

idan