Andreas Geiger
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
As pretrained text-to-image diffusion models become increasingly powerful, recent efforts have been made to distill knowledge from these text-to-image pretrained models for optimizing a text-guided 3D model. Most of the existing methods generate a holistic 3D model from a plain text input. This can be problematic when the text describes a complex scene with multiple objects, because the vectorized text embeddings are inherently unable to capture a complex description with multiple entities and relationships. Holistic 3D modeling of the entire scene further prevents accurate grounding of text entities and concepts. To address this limitation, we propose GraphDreamer, a novel framework to generate compositional 3D scenes from scene graphs, where objects are represented as nodes and their interactions as edges. By exploiting node and edge information in scene graphs, our method makes better use of the pretrained text-to-image diffusion model and is able to fully disentangle different objects without image-level supervision. To facilitate modeling of object-wise relationships, we use signed distance fields as representation and impose a constraint to avoid inter-penetration of objects. To avoid manual scene graph creation, we design a text prompt for ChatGPT to generate scene graphs based on text inputs. We conduct both qualitative and quantitative experiments to validate the effectiveness of GraphDreamer in generating high-fidelity compositional 3D scenes with disentangled object entities.
graphdreamer.github.io/
314 views
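
The inter-penetration constraint mentioned in the abstract lends itself to a compact sketch. Below is a minimal PyTorch illustration of penalizing sample points that lie inside two SDF-represented objects at once; it is my own sketch of the general idea, not GraphDreamer's actual loss, and sphere_a/sphere_b are toy stand-ins for learned SDF networks.

    import torch

    def penetration_penalty(sdf_a, sdf_b, points):
        """Penalize sample points that fall inside BOTH objects.

        sdf_a, sdf_b: callables mapping (N, 3) points to (N,) signed
        distances, negative inside the surface by convention.
        """
        depth_a = torch.relu(-sdf_a(points))  # penetration depth into A
        depth_b = torch.relu(-sdf_b(points))  # penetration depth into B
        # Nonzero only where a point is inside both objects at once.
        return (depth_a * depth_b).mean()

    # Toy usage: two unit spheres whose centers are 1.5 apart overlap.
    sphere_a = lambda p: p.norm(dim=-1) - 1.0
    sphere_b = lambda p: (p - torch.tensor([1.5, 0.0, 0.0])).norm(dim=-1) - 1.0
    pts = torch.rand(4096, 3) * 4 - 2  # uniform samples in [-2, 2]^3
    print(penetration_penalty(sphere_a, sphere_b, pts))  # > 0 => overlap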

Videos

Mip-Splatting: Alias-free 3D Gaussian Splatting
1.1K views · 21 days ago
Recently, 3D Gaussian Splatting has demonstrated impressive novel view synthesis results, reaching high fidelity and efficiency. However, strong artifacts can be observed when changing the sampling rate, e.g., by changing focal length or camera distance. We find that the source for this phenomenon can be attributed to the lack of 3D frequency constraints and the usage of a 2D dilation filter. T...
HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting
512 views · 2 months ago
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem. It encompasses understanding both the geometry and appearance to enable novel view synthesis, parsing semantic labels, and tracking moving objects. Despite considerable progress, existing approaches often focus on specific aspects of this task and require additional inputs such as LiDAR scans or m...
Efficient End-to-End Detection of 6-DoF Grasps for Robotic Bin Picking
267 views · 2 months ago
Bin picking is an important building block for many robotic systems, in logistics, production or in household use-cases. In recent years, machine learning methods for the prediction of 6-DoF grasps on diverse and unknown objects have shown promising progress. However, existing approaches only consider a single ground truth grasp orientation at a grasp location during training and therefore can ...
PlanT: Explainable Planning Transformers via Object-Level Representations
1.5K views · 1 year ago
Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene. While human drivers prioritize important objects and ignore details not relevant to the decision, learning-based planners typically extract features from dense, high-dimensional grid representations of the scene containing all vehicle and road context information. In this paper, we propo...
Constraining 3D Fields for Reconstruction and View Synthesis
2.8K views · 1 year ago
Talk at the ECCV 2022 workshop: "NGR-CO3D: Neural Geometry and Rendering: Advances and the Common Objects in 3D Challenge" ngr-co3d.github.io/ In this talk, I present: 1) RegNeRF: m-niemeyer.github.io/regnerf 2) MonoSDF: niujinshuchong.github.io/monosdf 3) TensoRF: apchenstu.github.io/TensoRF
Learning Robust Policies for Self-Driving
1K views · 1 year ago
Talk at the ECCV 2022 workshop: "AVVision: Autonomous Vehicle Vision Workshop" avvision.xyz/eccv22/ In this talk, I present: 1) TransFuser: github.com/autonomousvision/transfuser 2) PlanT: www.katrinrenz.de/plant 3) KING: lasnik.github.io/king
Generating Images and 3D Shapes
985 views · 1 year ago
Talk at the ECCV 2022 workshop: "Learning to Generate 3D Shapes and Scenes Workshop" learn3dg.github.io/ In this talk, I present: 1) StyleGAN-XL: sites.google.com/view/stylegan-xl 2) VoxGRAF: katjaschwarz.github.io/voxgraf 3) gDNA: xuchen-ethz.github.io/gdna
ARAH: Animatable Volume Rendering of Articulated Human SDFs
689 views · 1 year ago
Combining human body models with differentiable rendering has recently enabled animatable avatars of clothed humans from sparse sets of multi-view RGB videos. While state-of-the-art approaches achieve a realistic appearance with neural radiance fields (NeRF), the inferred geometry often lacks detail due to missing geometric constraints. Further, animating avatars in out-of-distribution poses is...
KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
825 views · 1 year ago
Simulators offer the possibility of safe, low-cost development of self-driving systems. However, current driving simulators exhibit naïve behavior models for background traffic. Hand-tuned scenarios are typically added during simulation to induce safety-critical situations. An alternative approach is to adversarially perturb the background traffic trajectories. In this paper, we study this appr...
KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
835 views · 1 year ago
Simulators offer the possibility of safe, low-cost development of self-driving systems. However, current driving simulators exhibit naïve behavior models for background traffic. Hand-tuned scenarios are typically added during simulation to induce safety-critical situations. An alternative approach is to adversarially perturb the background traffic trajectories. In this paper, we study this appr...
ARAH: Animatable Volume Rendering of Articulated Human SDFs
372 views · 1 year ago
Combining human body models with differentiable rendering has recently enabled animatable avatars of clothed humans from sparse sets of multi-view RGB videos. While state-of-the-art approaches achieve a realistic appearance with neural radiance fields (NeRF), the inferred geometry often lacks detail due to missing geometric constraints. Further, animating avatars in out-of-distribution poses is...
gDNA: Towards Generative Detailed Neural Avatars
704 views · 2 years ago
To make 3D human avatars widely available, we must be able to generate a variety of 3D virtual humans with varied identities and shapes in arbitrary poses. This task is challenging due to the diversity of clothed body shapes, their complex articulations, and the resulting rich, yet stochastic geometric detail in clothing. Hence, current methods that represent 3D people do not provide a full gen...
On the Frequency Bias of Generative Models
1.4K views · 2 years ago
The key objective of Generative Adversarial Networks (GANs) is to generate new data with the same statistics as the provided training data. However, multiple recent works show that state-of-the-art architectures yet struggle to achieve this goal. In particular, they report an elevated amount of high frequencies in the spectral statistics which makes it straightforward to distinguish real and ge...
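
The "spectral statistics" referred to here are typically summarized by an azimuthally averaged power spectrum; an elevated tail of that curve is the high-frequency excess the abstract describes. A minimal NumPy sketch of such a spectrum (my own illustration, not the paper's evaluation code):

    import numpy as np

    def radial_power_spectrum(img):
        """Azimuthally averaged power spectrum of a grayscale image."""
        power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
        h, w = img.shape
        y, x = np.indices((h, w))
        # Integer distance of each pixel from the spectrum's center.
        r = np.hypot(y - h // 2, x - w // 2).astype(int)
        sums = np.bincount(r.ravel(), weights=power.ravel())
        counts = np.bincount(r.ravel())
        return sums / np.maximum(counts, 1)  # mean power per frequency bin

    # Usage: compare the high-frequency tails of two grayscale images,
    # e.g. a real photo vs. a GAN sample (random arrays as stand-ins).
    real, fake = np.random.rand(128, 128), np.random.rand(128, 128)
    print(radial_power_spectrum(real)[-16:].mean(),
          radial_power_spectrum(fake)[-16:].mean())
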
MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images
672 views · 2 years ago
In this paper, we aim to create generalizable and controllable neural signed distance fields (SDFs) that represent clothed humans from monocular depth observations. Recent advances in deep learning, especially neural implicit representations, have enabled human shape reconstruction and controllable avatar generation from different sensor inputs. However, to generate realistic cloth deformations...
ATISS: Autoregressive Transformers for Indoor Scene Synthesis
3.5K views · 2 years ago
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
531 views · 2 years ago
Shape As Points: A Differentiable Poisson Solver
4.9K views · 2 years ago
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
310 views · 2 years ago
Driving with Attention
1.1K views · 2 years ago
Towards Animatable Human Avatars
980 views · 2 years ago
STEP: Segmenting and Tracking Every Pixel
1.9K views · 2 years ago
Generative Neural Scene Representations for 3D-Aware Image Synthesis
2.9K views · 2 years ago
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
4K views · 2 years ago
UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
3.6K views · 2 years ago
SLIM: Self-Supervised LiDAR Scene Flow and Motion Segmentation
2K views · 2 years ago
SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
728 views · 2 years ago
NEAT: Neural Attention Fields for End-to-End Autonomous Driving
1.4K views · 2 years ago
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
6K views · 2 years ago
SLIM: Self-Supervised LiDAR Scene Flow and Motion Segmentation
1.2K views · 2 years ago

COMMENTS

  • @tienquangpham9316
    @tienquangpham9316 1 month ago

    As a self-taught machine learning enthusiast who still struggles with ML accuracy in waste classification projects, I find Mr. Quoc Le's work on the "noisy student" technique to be very helpful. Thank you!

  • @leslietetteh7292
    @leslietetteh7292 2 months ago

    Hey Andreas. I'm playing around with diffusion models to see whether real-time image-to-image inference from polygon renderings to photorealistic scenes is possible. Since you specialise in automating the reverse, photorealism to 3D polygon renderings, I wonder if it would be possible for you to create a dataset that might help with this process?

  • @quake3video
    @quake3video 2 months ago

    Saves me doing this, now that I've discovered someone else already did it lol.

  • @Kamran_ai
    @Kamran_ai 3 months ago

    Great work, but how do I visualize the resulting GIF for an input image and its generated mesh file?

  • @blakeedwards3582
    @blakeedwards3582 4 months ago

    Thank you!

  • @akshatdobhal5653
    @akshatdobhal5653 6 months ago

    Great Conference!

  • @kristoferkrus
    @kristoferkrus 9 months ago

    Nice video and great results! However, is "convolutional" part of the name because the method uses convolutions somehow, or simply because it uses a grid (or several grids), just like a CNN?

  • @graham8316
    @graham8316 10 months ago

    What do the colors mean in optical flow? Is it a vector to color mapping?
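
    For reference, standard optical flow visualizations do map vectors to colors: direction becomes hue and magnitude becomes brightness. A minimal NumPy sketch of that common HSV convention follows (the exact color wheel used in the video is an assumption):

        import numpy as np
        from matplotlib.colors import hsv_to_rgb

        def flow_to_color(u, v):
            """Map a 2-D flow field to RGB: direction -> hue, magnitude -> value."""
            hue = (np.arctan2(v, u) / (2 * np.pi)) % 1.0   # angle mapped to [0, 1)
            val = np.hypot(u, v)
            val = val / (val.max() + 1e-8)                 # normalized magnitude
            hsv = np.stack([hue, np.ones_like(hue), val], axis=-1)
            return hsv_to_rgb(hsv)                         # (H, W, 3) RGB image

        # Usage: a synthetic rotational field; hue rotates around the center.
        y, x = np.mgrid[-1:1:64j, -1:1:64j]
        rgb = flow_to_color(-y, x)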

  • @VinodKumar-rj3kt
    @VinodKumar-rj3kt 10 months ago

    Cool approach.

  • @realolivertwisted
    @realolivertwisted 11 months ago

    I watched a lot of his lectures before he was banned. Any chance you know where he ended up? Haven’t checked alt platforms yet but I’m going to. (Found this video by Binging “Michael Black ideological subversion” and scrolling WAAAY down in the videos.) He didn’t (mysteriously/suspiciously) die like Yuri B., did he? 😬🙏🏼

  • @seantan4702
    @seantan4702 11 months ago

    impressive

  • @yanzhanchen9210
    @yanzhanchen9210 1 year ago

    It is so cool!

  • @LilGnomeGames
    @LilGnomeGames 1 year ago

    I would love to see where this project ended up or where it is today.

  • @chi-yaohuang3257
    @chi-yaohuang3257 1 year ago

    Great work!

  • @muhammadichsan914
    @muhammadichsan914 1 year ago

    I have my own point cloud file (PLY format); how do I run inference on it to get a mesh? The demo uses pointcloud.npz rather than PLY.

  • @warzeo8869
    @warzeo8869 1 year ago

    Bro, where did you buy it, bro?

  • @vornamenachname906
    @vornamenachname906 1 year ago

    yeah, it's called overfitting ...

  • @BlakeEdwards333
    @BlakeEdwards333 1 year ago

    What labeling tool did you use for the 3D object tracklets?

  • @rohitdhankar360
    @rohitdhankar360 1 year ago

    16:40 - Avoid THRESHOLDING.

  • @zahidulislam7189
    @zahidulislam7189 1 year ago

    Thank you. Inspiring works.

  • @zahidulislam7189
    @zahidulislam7189 1 year ago

    Super work. Learned many things. Thanks for the presentation.

  • @nhonth2011
    @nhonth2011 1 year ago

    Great video! Thank you.

  • @yongpengchang7238
    @yongpengchang7238 1 year ago

    Hi Andreas, I'm interested in the RVC 2022 talks and I wonder if you will upload the RVC 2022 videos here, thank you!

  • @joederksen7441
    @joederksen7441 1 year ago

    Hi Andreas, thanks for the video. I was wondering, in the TensoRF model, can you easily compute the gradient of the density function at any point (i.e., the surface normal vector)? With a NeRF I can compute the gradient of sigma using autodiff on the neural network. How does this translate to TensoRF?

    • @cvlibs
      @cvlibs 1 year ago

      This should also be possible; autodiff then basically differentiates through the (bi-/trilinear) interpolation function.
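
      A minimal PyTorch sketch of that reply (my own illustration, not TensoRF's code): differentiate a trilinearly interpolated density grid with respect to the query point to get a normal direction.

          import torch
          import torch.nn.functional as F

          # Toy stand-in for a decoded TensoRF density grid: (N, C, D, H, W).
          grid = torch.rand(1, 1, 8, 8, 8)

          def density(points):
              """Trilinearly interpolate the grid at points in [-1, 1]^3."""
              coords = points.view(1, -1, 1, 1, 3)        # grid_sample layout
              sigma = F.grid_sample(grid, coords, mode='bilinear',
                                    align_corners=True)   # trilinear for 5-D input
              return sigma.view(-1)

          pts = torch.tensor([[0.1, -0.2, 0.3]], requires_grad=True)
          (grad,) = torch.autograd.grad(density(pts).sum(), pts)
          # One common convention: the outward normal points along the
          # negative density gradient.
          normal = -grad / grad.norm(dim=-1, keepdim=True)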

  • @thetomer9786
    @thetomer9786 1 year ago

    What algorithm did you use for that?

  • @jakobpcoder
    @jakobpcoder 1 year ago

    Will there be new lectures this semester in computer vision, machine learning, etc.? I've already watched all of last year's, voluntarily; I'm from Hamburg.

  • @jakobpcoder
    @jakobpcoder 1 year ago

    Great! Very cool that the code is available, even though someone from Mercedes was involved :D

  • @ahmetfurkanaknc8959
    @ahmetfurkanaknc8959 1 year ago

    Thank you, excellent explanation.

  • @ramakrishnamutalikdesai3491

    can u share the code plz

  • @sanje1285
    @sanje1285 1 year ago

    Thank you for sharing your great presentation. I have a question. At 25:51, on the right side of the figure, I cannot understand the meaning of the black & white dots on the vector flow and semantic BEV. I think the black dots are points (x, y) placed at pseudo-random positions on the BEV, used to generate the vector from each position to the target waypoint. Could you clarify this?

  • @LiChengqi
    @LiChengqi 1 year ago

    👍

  • @bernhard_jaeger
    @bernhard_jaeger 1 year ago

    Nice video presentation.

  • @8eck
    @8eck 1 year ago

    Just wow! This is awesome! Thank you for your hard work. This is helping people like me to self-study and improve our skills in computer vision.

  • @4dvovo
    @4dvovo 1 year ago

    Thank you, Johannes Schönberger, for this fantastic open-source SfM pipeline software!

  • @hyeon-jinlee8716
    @hyeon-jinlee8716 2 years ago

    This presentation is amazing! It is very helpful for understanding everything from previous work (NeRF) to recent work (GIRAFFE), including current work (GRAF). And awesome work. Thank you very much!

  • @zosters9407
    @zosters9407 2 years ago

    Can you please share the code for semantic segmentation on the KITTI-360 dataset?

  • @huytruong31127
    @huytruong31127 2 years ago

    How do I make a 3D picture from a 2D picture? Can you make a video on this?

  • @rabailrana9916
    @rabailrana9916 2 years ago

    Please let me know where the complete dataset is.

  • @nirbhay_raghav
    @nirbhay_raghav 2 years ago

    I can't believe this was 11 years ago!! Imagine the tech the CIA has to do all of this. Drive a car around and create such detailed maps.

  • @MarcusVinicius-lq3fe
    @MarcusVinicius-lq3fe 2 years ago

    great!

  • @nirbhay_raghav
    @nirbhay_raghav 2 years ago

    How do they compare with implicit representations using sinusoidal activations?

  • @user-oj4hr5rh6i
    @user-oj4hr5rh6i 2 years ago

    Good idea to use some active contour concepts.

  • @lesleyHsieh
    @lesleyHsieh 2 years ago

    Fantastic research! The presentation is very clear. Thank you very much for sharing!

  • @blakeedwards3582
    @blakeedwards3582 2 years ago

    thank you

  • @user-im8gv6eh2y
    @user-im8gv6eh2y 2 years ago

    fantastic

  • @vertex.shader
    @vertex.shader 2 years ago

    Are you just ray marching at 7:23?

  • @LL-rn8rn
    @LL-rn8rn 2 years ago

    How does it compare to detectron2's segmentation?

  • @aeaxao9973
    @aeaxao9973 2 years ago

    Great job!

  • @MeKaashu
    @MeKaashu 2 years ago

    1:05 Ironic how the Toyota Research team used a Volkswagen as a test vehicle.

  • @edilgin622
    @edilgin622 2 years ago

    wow really amazing work!