Hacker News | yurylifshits's comments

There's another important contender in this space: the Hunyuan model from Tencent.

My company (Nim) is hosting the Hunyuan model, so here's a quick test (first attempt) at "pelican riding a bicycle" via Hunyuan on Nim: https://nim.video/explore/OGs4EM3MIpW8

I think it's as good as, if not better than, Sora / Veo.


> A whimsical pelican, adorned in oversized sunglasses and a vibrant, patterned scarf, gracefully balances on a vintage bicycle, its sleek feathers glistening in the sunlight. As it pedals joyfully down a scenic coastal path, colorful wildflowers sway gently in the breeze, and azure waves crash rhythmically against the shore. The pelican occasionally flaps its wings, adding a playful touch to its enchanting ride. In the distance, a serene sunset bathes the landscape in warm hues, while seagulls glide gracefully overhead, celebrating this delightful and lighthearted adventure of a pelican enjoying a carefree day on two wheels.

What does it produce for “A pelican riding a bicycle along a coastal path overlooking a harbor”?

Or, what do Sora and Veo produce for your verbose prompt?


If Sora is anything like DALL-E, a prompt like "A pelican riding a bicycle along a coastal path overlooking a harbor" will be extended into something like the longer prompt behind the scenes. OpenAI has been augmenting image prompts from day 1.
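For the curious, here is a minimal sketch of that augmentation pattern: a chat model expands the terse prompt before it reaches the image generator. This illustrates the general idea rather than OpenAI's internal pipeline; the model names and system instruction are assumptions.

    # Sketch of the prompt-augmentation pattern described above: a short user prompt
    # is expanded by a language model before it is handed to the image generator.
    # This illustrates the general idea, not OpenAI's internal pipeline; the model
    # names and system instruction below are assumptions.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def augment_prompt(short_prompt: str) -> str:
        """Rewrite a terse prompt into a detailed scene description."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model choice
            messages=[
                {"role": "system",
                 "content": "Expand the user's prompt into one vivid paragraph describing "
                            "the scene, lighting, and motion. Return only the paragraph."},
                {"role": "user", "content": short_prompt},
            ],
        )
        text = response.choices[0].message.content
        return text.strip() if text else short_prompt

    detailed = augment_prompt(
        "A pelican riding a bicycle along a coastal path overlooking a harbor"
    )

    # Hand the expanded prompt to the image model. DALL-E 3 also rewrites prompts
    # server-side and reports the result in revised_prompt.
    image = client.images.generate(model="dall-e-3", prompt=detailed, size="1024x1024", n=1)
    print(image.data[0].revised_prompt)
    print(image.data[0].url)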


Hard to say about Sora, but the video you shared is most definitely worse than Veo's.

The pelican is doing a weird flying motion, motion blur is hiding a lack of detail, the bicycle is moving fast so the background is blurred, etc. I would even say Sora's is better, because I like the slow motion and detail, but it did do something very non-physical.

Veo is clearly the best in this example. It has high detail but also feels the most physically grounded among the examples.


The prompt asks that it flaps its wings. So it's actually really impressive how closely it adheres (including the rest of the little details in the prompt, like the scarf). Definitely the best of the three, in my opinion.


Pretty good except the backwards body and the strange wing movement. The feeling of motion is fantastic though.


I was curious how it would perform with prompt enhancement turned off. Here's a single attempt (no regenerations etc.): https://www.youtube.com/watch?v=730cb2qozcM

If you'd like to replicate this, the sign-up process was very easy and I was able to run a single generation attempt right away. Maybe later, when I want to generate video, I'll use prompt enhancement. Without it, the video appears to have lost any notion of direction. Most image-generation models I'm aware of do prompt enhancement; I've seen it on Grok+Flow/Aurora and ChatGPT+DALL-E.

    Prompt: A pelican riding a bicycle along a coastal path overlooking a harbor
    Seed: 15185546
    Resolution: 720×480
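If you want to reproduce a specific attempt, the key is pinning the same prompt, seed, and resolution. Below is a hypothetical sketch of what such a request could look like; the endpoint, field names, and API key variable are placeholders, since Nim's actual API (if it exposes one) may look different.

    # Hypothetical sketch of replaying one generation attempt by pinning its
    # parameters. The endpoint, field names, and API key variable are placeholders;
    # this is not Nim's documented API.
    import os
    import requests

    payload = {
        "prompt": "A pelican riding a bicycle along a coastal path overlooking a harbor",
        "seed": 15185546,             # same seed => same attempt, given the same model version
        "width": 720,
        "height": 480,
        "prompt_enhancement": False,  # the setting toggled off above
    }

    resp = requests.post(
        "https://api.example.com/v1/video/generations",  # placeholder URL
        headers={"Authorization": f"Bearer {os.environ['VIDEO_API_KEY']}"},
        json=payload,
        timeout=600,
    )
    resp.raise_for_status()
    print(resp.json())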


I mean, you didn’t SAY riding forwards…


I suppose if you reversed it, it would look OK-ish.


FYI your website shows me a static image on iOS 18.2 Safari. Strangely, the progress bar still appears to “loop,” but the bird isn’t moving at all.

Turning content blockers off does not make a difference.


FWIW, it is finicky, but the video played after a couple of seconds (iOS 18.2 Safari).


Reddit says it is much better than Sora. Are you hosting the full version of Hunyuan? (Your video looks great.)


Hunyuan is also open source / source-available unless you have 100M DAU.

Then there's Lightricks' LTX-1 model and Genmo's Mochi-1. Even the research model CogVideoX is making progress.

Open source video AI is just getting started, but it's off to a strong start.


Our limited tests show that yes, Hunyuan is comparable to or better than Sora on most prompts. Very promising model.


Is it still better if you copy his whole prompt instead of half of it?


I mean, the pelican's body is backwards...


Hi zoogeny (and anyone else here) — you can try our new app Nim to address the Runway problems you describe https://alpha.nim.video

We offer both image-to-video (same situation as Runway: you need a few attempts to make something awesome) and video-to-video (under the name "Restyle 2.0"). The latter is our newest tool and is highly reliable, i.e. you can get complex motion (kissing, handshakes, boxing, skateboarding, etc.) with controllable changes to the input video (changing outfits, characters, backgrounds, styles).

Unlike Runway and Kling, we currently offer a simple UNLIMITED plan for just $10/mo. Check it out! https://alpha.nim.video


Thanks - will look into this more deeply once I am ready to start integrating generation into my tool.

Do you have an API that can be called? Are you interested in reselling your technology through 3rd party tools?


What are the maximum video dimensions your service can output? With a 1024x1024 image, it exports 512x512 on the free plan.


Google started as a research project in 1995 and incorporated in 1998. Susan was employee number 16. 25 years sounds about right.


Yes, we follow them closely. Key differences:

— Mighty's main focus is on top-down content, not messaging with organizers and among members.

— Mighty's communities are all separate, with individual accounts and sometimes even separate mobile apps. On Openland people have just one account and inbox for all their communities. Much simpler than Mighty, Slack, or Discord.


Yes, Openland is primarily built for organizers who have the audience but need better tools for onboarding, engagement, moderation, and monetization. Happy to email/chat/talk in more detail about what we have. We know the pains of Slack well and have solved many of them. Feel free to reach out at yury@openland.com or https://openland.com/yury

The landing page is for community members. We'll add a separate landing page for organizers soon.


We think that sustainable, people-first social networking will follow from new business models. Openland's bet is on a SaaS model for community organizers + a revenue cut from member-funded communities. This business model has better incentive alignment between the Openland platform, members, and organizers.


Interesting - thanks for your response.

If I'm understanding correctly, do you mean you hope for your primary customer base to be community organizers and member-based communities?


Dave, congrats on launching your own group on Openland so quickly. We'll help you find members among people who are already on Openland.


I like this analogy! Key differences between Openland and Reddit:

— Most users use their real name

— Focus on direct and group chats, not posts

— Ability to save people to your contacts, plus profile hashtags to help find interesting people

— More professional-grade tools for organizers: automation, integrations, analytics, payments, etc.


There are two business models here:

1. Community is free for members; organizers pay Openland for premium features. This is useful for businesses that use communities for volunteer management, lead generation, product sales, support, and referral programs.

2. Members pay organizers, and Openland takes a cut. This is useful for various "knowledge products": premium content, consulting, coaching, courses, mastermind groups, and networking clubs.


Fair point: there are groups in various languages on Openland, and the group language isn't always obvious from the group name. We need to make our community recommendations more aligned with people's preferred languages.

The most useful communities on Openland are single language and well-moderated.

