
https://free-visit.net A tool to create a tour of your place: a real first-person (FPS-style) tour.


You said : "- detecting lines from pixels - detecting geometry in pointclouds - constructing 3D from stereo images, photogrammetry, 360 panoramas"

  ==> For me it is more something like:
   Source = crude video or photo pixels ===> find many simple rectangular surfaces that are glued to one another.
This, for me, is how you can fairly easily detect the rather complex geometry of any room.


I kind of did a version of what you suggest - I think I linked to a video showing plane edges auto-detected in a pointcloud sample.

Similarly, I use another algorithm to detect pipe runs, which tend to appear as half-cylinders in the pointcloud: the scanner usually sees one side, while the other side is hidden, hard to access, or up against a wall.
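For anyone curious, plane detection in a pointcloud is commonly done with RANSAC-style sampling; this is a minimal sketch of the general idea, not the actual algorithm referenced above:

```typescript
type Pt = { x: number; y: number; z: number };

// Fit a plane through 3 points; returns unit normal n and offset d (n·p = d),
// or null if the points are (nearly) collinear.
function planeFrom3(a: Pt, b: Pt, c: Pt) {
  const u = { x: b.x - a.x, y: b.y - a.y, z: b.z - a.z };
  const v = { x: c.x - a.x, y: c.y - a.y, z: c.z - a.z };
  let n = {
    x: u.y * v.z - u.z * v.y,
    y: u.z * v.x - u.x * v.z,
    z: u.x * v.y - u.y * v.x,
  };
  const len = Math.hypot(n.x, n.y, n.z);
  if (len < 1e-9) return null; // degenerate sample
  n = { x: n.x / len, y: n.y / len, z: n.z / len };
  return { n, d: n.x * a.x + n.y * a.y + n.z * a.z };
}

// RANSAC: repeatedly sample 3 random points, keep the plane with most inliers.
function ransacPlane(pts: Pt[], iters = 200, tol = 0.02) {
  let best = { n: { x: 0, y: 0, z: 1 }, d: 0, inliers: 0 };
  for (let i = 0; i < iters; i++) {
    const [a, b, c] = [0, 0, 0].map(
      () => pts[Math.floor(Math.random() * pts.length)]
    );
    const plane = planeFrom3(a, b, c);
    if (!plane) continue;
    const { n, d } = plane;
    const inliers = pts.filter(
      (p) => Math.abs(n.x * p.x + n.y * p.y + n.z * p.z - d) < tol
    ).length;
    if (inliers > best.inliers) best = { n, d, inliers };
  }
  return best;
}
```

Real scanners produce millions of points, so production code would subsample and then region-grow from the RANSAC seed, but the core loop is this simple.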

So, I guess my point is that the devil is in the details... and machine learning can optimize even further on top of whatever good heuristics we come up with.

Also, when you go through a whole pointcloud you have a lot of data to sift through, so you want something fairly efficient, even if you're using multiple GPUs to do the heavy matmul lifting.

You can think of RL as an optimization: it greatly speeds up something like Monte Carlo tree search by learning to guess the best solution earlier.
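Concretely, the AlphaZero-style trick is to let a learned policy supply a prior over moves and fold it into the tree-search selection rule (PUCT), so promising branches get explored before many simulations accumulate. A sketch of just the selection step, with illustrative names:

```typescript
type Child = { prior: number; visits: number; valueSum: number };

// PUCT selection: score = Q + c * prior * sqrt(totalVisits) / (1 + visits).
// With few visits the learned prior dominates; as visits accumulate,
// the empirical value Q takes over.
function selectChild(children: Child[], c = 1.5): number {
  const total = children.reduce((s, ch) => s + ch.visits, 0);
  let bestIdx = 0;
  let bestScore = -Infinity;
  children.forEach((ch, i) => {
    const q = ch.visits > 0 ? ch.valueSum / ch.visits : 0;
    const u = (c * ch.prior * Math.sqrt(total + 1)) / (1 + ch.visits);
    if (q + u > bestScore) {
      bestScore = q + u;
      bestIdx = i;
    }
  });
  return bestIdx;
}
```

This is exactly the "guess the best solution earlier" effect: the prior prunes the effective branching factor without hard-coding any heuristic.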


I do agree with you. We have a natural-eye automat (what you call a 'biological brain') that unconsciously 'feels' the geometric structure of the places we enter.

Once this "natural eye automat" layer is programmed behind a camera, it will spit out that crude geometry: the Spatial Data Bulk (SDB). This SDB is small data.

From then on, our programs reason not on raw camera data but only on this small SDB.

This is how I see it.


==> And then LLMs, to acquire spatial knowledge, will have a much smaller dataset to work with. This will make spatial reasoning far less compute-intensive than we might imagine.
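To make the "small data" point concrete, here is a purely hypothetical shape for such an SDB; every name here is illustrative, not an existing format:

```typescript
// Hypothetical "Spatial Data Bulk": a handful of rectangular surfaces per
// room instead of millions of pixels. All names are made up for illustration.
interface RectSurface {
  origin: [number, number, number]; // one corner, in meters
  u: [number, number, number];      // first edge vector
  v: [number, number, number];      // second edge vector
  kind: "wall" | "floor" | "ceiling" | "opening";
}

interface SpatialDataBulk {
  rooms: { name: string; surfaces: RectSurface[] }[];
}

// A simple box-shaped room is ~6 surfaces: a few hundred bytes of JSON,
// versus megabytes of pixels for the photos it was derived from.
function surfaceCount(sdb: SpatialDataBulk): number {
  return sdb.rooms.reduce((n, r) => n + r.surfaces.length, 0);
}
```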


Still working on an immersive tour maker. A visit example: https://free-visit.net/fr/demo01


Personally, I have been using Babylon.js for five years, and I just love it. For me it's so easy to program (the cleanest API I have ever seen), and the 3D runtime is so light that my demos work fine even on my Android phone.


Web browsers add a lot of unnecessary overhead, and require dancing with quarterly policy changes.

In general, iOS devices are forced to use/link Apple's proprietary JS VM implementation. While Babylon makes things easier, its features often get nerfed on both Apple's iOS and Google's Android: in the former case by the App Store walled-garden business model, in the latter by device-design fragmentation.

I like Babylon in many ways too, but we have to acknowledge the deployment limitations that impact end users. People often end up patching after every update Mozilla/Apple/Microsoft pushes.

Thus, it is difficult to deploy something unaffected by platform-specific codecs, media syncing, and interface-hook shenanigans.

This coverage issue is trivial to handle in Unity, Godot, and Unreal.

The App store people always want their cut, and will find convenient excuses to nudge that policy. It is the price of admission on mobile... YMMV =3


One component of my hobby web-app project is a wavetable (two examples below). I want it not to tax the browser, so that other, latency-sensitive components don't suffer.

Would you have any suggestions on which JS/TS package to use? I built a quick prototype in three.js, but I am neither a 3D person nor a web dev, so I would appreciate your advice.

Examples:

- https://audiolabs-erlangen.de/media/pages/resources/MIR/2024...

- https://images.squarespace-cdn.com/content/v1/5ee5aa63c3a410...


Personally, I wouldn't try to run DSP pipeline code in a VM.

1. Use a globally fixed format: 16-bit 44.1 kHz stereo, with a raw uncompressed lossless codec (avoids GPU/hardware-codec and sound-card-specific quirks)

2. Don't try to sync your audio to the GPU's 24fps+ animations ( https://en.wikipedia.org/wiki/Globally_asynchronous_locally_... ). I'd just cheat your display by polling a non-blocking FIFO stream copy at 10 Hz. YMMV

3. Try statically allocated FIFO* buffers in wasm, and software mixers down to a single output stream for local chunk playback ( https://en.wikipedia.org/wiki/Clock_domain_crossing )

* Recall that fixed-rate producers/consumers should hold their relative phase even when the garbage collector decides to ruin your day; things like software FIR filters are also fine, and a single-threaded, pre-mixed output stream will eventually buffer through whatever abstraction the local user has set up (i.e. while the GC does its thing, playback sounds continuous).

Inside a VM we are unfortunately at the mercy of the garbage collector and of any assumptions the JIT-compiled language makes. Yet wasm should be able to push I/O transfers fast enough for software mixers on modern CPUs.
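A sketch of point 3, in plain TypeScript rather than wasm for readability: a fixed-capacity FIFO over a preallocated buffer, plus a software mixer that sums several streams into one output chunk. Nothing is allocated on the audio path after setup, and a starved producer degrades to silence rather than blocking:

```typescript
// Fixed-capacity FIFO over a preallocated Float32Array: no allocation while
// samples flow, so the GC has nothing to collect on the hot path.
class SampleFifo {
  private buf: Float32Array;
  private head = 0; // read index
  private tail = 0; // write index
  private count = 0;
  constructor(capacity: number) {
    this.buf = new Float32Array(capacity);
  }
  // Non-blocking write: returns how many samples were accepted.
  push(src: Float32Array): number {
    let n = 0;
    while (n < src.length && this.count < this.buf.length) {
      this.buf[this.tail] = src[n++];
      this.tail = (this.tail + 1) % this.buf.length;
      this.count++;
    }
    return n;
  }
  // Non-blocking read into dst; the unwritten tail is zero-filled (silence).
  pop(dst: Float32Array): number {
    let n = 0;
    while (n < dst.length && this.count > 0) {
      dst[n++] = this.buf[this.head];
      this.head = (this.head + 1) % this.buf.length;
      this.count--;
    }
    dst.fill(0, n);
    return n;
  }
}

// Software mixer: sum several FIFOs into one pre-mixed output chunk.
function mixInto(out: Float32Array, fifos: SampleFifo[]): void {
  const tmp = new Float32Array(out.length);
  out.fill(0);
  for (const f of fifos) {
    f.pop(tmp);
    for (let i = 0; i < out.length; i++) out[i] += tmp[i];
  }
}
```

In a real deployment the same structure would live in a wasm linear-memory buffer and feed an AudioWorklet-style output callback; the point is only that the producer and consumer never share an allocator with the rest of the page.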

Best of luck =3


Thank you!


Working on a 3D editor that transforms photos of a place into an FPS-style game. - Editor: https://www.youtube.com/watch?v=LEsqp93sq3w - FPS example: https://free-visit.net/fr/demo01

This has been my weekend project for a long time, but it's only really working now.


A Unity for immersive visits: https://www.youtube.com/watch?v=LEsqp93sq3w


A visit generated by the tool: https://free-visit.net/fr/demo01


Impressed with the Bolt3D AI model! - The speed of the 3D generation - The accurate 3D mesh deduction. It's a wonderful shock.

I agree, this is the way forward: - "some photos" as input - Convenient: a camera is in every pocket (smartphone).

On weekends, I have been trying for years to generate 3D from photos. My tool now works well, but there is still the big problem of the time it takes to "recreate" the 3D mesh from photos. Remember, photos are in... 2D. Not convenient. Here is an example of my tool's output: https://free-visit.net/fr/demo01

Here, Bolt3D turns those four hours of cumbersome work into an automatic process. Wahoo!

So Bravo to the Bolt3d team of researchers.


It looks fun, with a good look and feel. The summaries are quite OK. Which AI service do you use?


I started using openrouter.ai to have a unified API for different models. For HN Distilled I use gemini-2.0-flash-001 – it has a huge context window and a decent price (much cheaper than the others for the same quality).
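For reference, OpenRouter exposes an OpenAI-compatible chat endpoint; this sketch shows roughly what such a summarization call looks like. The endpoint, model id, and prompt are my reading of the comment above, so check the current docs before relying on it:

```typescript
// OpenAI-compatible chat request for OpenRouter (endpoint/model id as I
// understand them; verify against the current OpenRouter documentation).
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function buildSummaryRequest(thread: string): ChatRequest {
  return {
    model: "google/gemini-2.0-flash-001",
    messages: [
      { role: "system", content: "Summarize this Hacker News thread." },
      { role: "user", content: thread },
    ],
  };
}

// The actual network call (requires an API key), sketched:
async function summarize(thread: string, apiKey: string): Promise<string> {
  const res = await fetch(OPENROUTER_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildSummaryRequest(thread)),
  });
  const json = await res.json();
  return json.choices[0].message.content;
}
```

The nice part of the unified API is that swapping models is a one-string change in `buildSummaryRequest`.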


A no-code video game builder: an example made in 3 hours

https://free-visit.net/fv_users/garance/vis/VisiteBNF001-004...

