Andreas Geiger
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
As pretrained text-to-image diffusion models become increasingly powerful, recent efforts have been made to distill knowledge from these text-to-image pretrained models for optimizing a text-guided 3D model. Most of the existing methods generate a holistic 3D model from a plain text input. This can be problematic when the text describes a complex scene with multiple objects, because the vectorized text embeddings are inherently unable to capture a complex description with multiple entities and relationships. Holistic 3D modeling of the entire scene further prevents accurate grounding of text entities and concepts. To address this limitation, we propose GraphDreamer, a novel framework to generate compositional 3D scenes from scene graphs, where objects are represented as nodes and their interactions as edges. By exploiting node and edge information in scene graphs, our method makes better use of the pretrained text-to-image diffusion model and is able to fully disentangle different objects without image-level supervision. To facilitate modeling of object-wise relationships, we use signed distance fields as representation and impose a constraint to avoid inter-penetration of objects. To avoid manual scene graph creation, we design a text prompt for ChatGPT to generate scene graphs based on text inputs. We conduct both qualitative and quantitative experiments to validate the effectiveness of GraphDreamer in generating high-fidelity compositional 3D scenes with disentangled object entities.
graphdreamer.github.io/
314 views
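
The inter-penetration constraint mentioned in the abstract lends itself to a compact sketch. Below is a minimal PyTorch illustration of penalizing sample points that lie inside two SDF-represented objects at once; it is my own sketch of the general idea, not GraphDreamer's actual loss, and sphere_a/sphere_b are toy stand-ins for learned SDF networks.

    import torch

    def penetration_penalty(sdf_a, sdf_b, points):
        """Penalize sample points that fall inside BOTH objects.

        sdf_a, sdf_b: callables mapping (N, 3) points to (N,) signed
        distances, negative inside the surface by convention.
        """
        depth_a = torch.relu(-sdf_a(points))  # penetration depth into A
        depth_b = torch.relu(-sdf_b(points))  # penetration depth into B
        # Nonzero only where a point is inside both objects at once.
        return (depth_a * depth_b).mean()

    # Toy usage: two unit spheres whose centers are 1.5 apart overlap.
    sphere_a = lambda p: p.norm(dim=-1) - 1.0
    sphere_b = lambda p: (p - torch.tensor([1.5, 0.0, 0.0])).norm(dim=-1) - 1.0
    pts = torch.rand(4096, 3) * 4 - 2  # uniform samples in [-2, 2]^3
    print(penetration_penalty(sphere_a, sphere_b, pts))  # > 0 => overlap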

Videos

Mip-Splatting: Alias-free 3D Gaussian Splatting
1.1K views · 21 days ago
Recently, 3D Gaussian Splatting has demonstrated impressive novel view synthesis results, reaching high fidelity and efficiency. However, strong artifacts can be observed when changing the sampling rate, e.g., by changing focal length or camera distance. We find that the source for this phenomenon can be attributed to the lack of 3D frequency constraints and the usage of a 2D dilation filter. T...
HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting
512 views · 2 months ago
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem. It encompasses understanding both the geometry and appearance to enable novel view synthesis, parsing semantic labels, and tracking moving objects. Despite considerable progress, existing approaches often focus on specific aspects of this task and require additional inputs such as LiDAR scans or m...
Efficient End-to-End Detection of 6-DoF Grasps for Robotic Bin Picking
267 views · 2 months ago
Bin picking is an important building block for many robotic systems, in logistics, production or in household use-cases. In recent years, machine learning methods for the prediction of 6-DoF grasps on diverse and unknown objects have shown promising progress. However, existing approaches only consider a single ground truth grasp orientation at a grasp location during training and therefore can ...
PlanT: Explainable Planning Transformers via Object-Level Representations
1.5K views · 1 year ago
Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene. While human drivers prioritize important objects and ignore details not relevant to the decision, learning-based planners typically extract features from dense, high-dimensional grid representations of the scene containing all vehicle and road context information. In this paper, we propo...
Constraining 3D Fields for Reconstruction and View Synthesis
2.8K views · 1 year ago
Talk at the ECCV 2022 workshop: "NGR-CO3D: Neural Geometry and Rendering: Advances and the Common Objects in 3D Challenge" ngr-co3d.github.io/ In this talk, I present: 1) RegNeRF: m-niemeyer.github.io/regnerf 2) MonoSDF: niujinshuchong.github.io/monosdf 3) TensoRF: apchenstu.github.io/TensoRF
Learning Robust Policies for Self-Driving
1K views · 1 year ago
Talk at the ECCV 2022 workshop: "AVVision: Autonomous Vehicle Vision Workshop" avvision.xyz/eccv22/ In this talk, I present: 1) TransFuser: github.com/autonomousvision/transfuser 2) PlanT: www.katrinrenz.de/plant 3) KING: lasnik.github.io/king
Generating Images and 3D Shapes
985 views · 1 year ago
Talk at the ECCV 2022 workshop: "Learning to Generate 3D Shapes and Scenes Workshop" learn3dg.github.io/ In this talk, I present: 1) StyleGAN-XL: sites.google.com/view/stylegan-xl 2) VoxGRAF: katjaschwarz.github.io/voxgraf 3) gDNA: xuchen-ethz.github.io/gdna
ARAH: Animatable Volume Rendering of Articulated Human SDFs
689 views · 1 year ago
Combining human body models with differentiable rendering has recently enabled animatable avatars of clothed humans from sparse sets of multi-view RGB videos. While state-of-the-art approaches achieve a realistic appearance with neural radiance fields (NeRF), the inferred geometry often lacks detail due to missing geometric constraints. Further, animating avatars in out-of-distribution poses is...
KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
825 views · 1 year ago
Simulators offer the possibility of safe, low-cost development of self-driving systems. However, current driving simulators exhibit naïve behavior models for background traffic. Hand-tuned scenarios are typically added during simulation to induce safety-critical situations. An alternative approach is to adversarially perturb the background traffic trajectories. In this paper, we study this appr...
KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
835 views · 1 year ago
Simulators offer the possibility of safe, low-cost development of self-driving systems. However, current driving simulators exhibit naïve behavior models for background traffic. Hand-tuned scenarios are typically added during simulation to induce safety-critical situations. An alternative approach is to adversarially perturb the background traffic trajectories. In this paper, we study this appr...
ARAH: Animatable Volume Rendering of Articulated Human SDFs
372 views · 1 year ago
Combining human body models with differentiable rendering has recently enabled animatable avatars of clothed humans from sparse sets of multi-view RGB videos. While state-of-the-art approaches achieve a realistic appearance with neural radiance fields (NeRF), the inferred geometry often lacks detail due to missing geometric constraints. Further, animating avatars in out-of-distribution poses is...
gDNA: Towards Generative Detailed Neural Avatars
704 views · 2 years ago
To make 3D human avatars widely available, we must be able to generate a variety of 3D virtual humans with varied identities and shapes in arbitrary poses. This task is challenging due to the diversity of clothed body shapes, their complex articulations, and the resulting rich, yet stochastic geometric detail in clothing. Hence, current methods that represent 3D people do not provide a full gen...
On the Frequency Bias of Generative Models
1.4K views · 2 years ago
The key objective of Generative Adversarial Networks (GANs) is to generate new data with the same statistics as the provided training data. However, multiple recent works show that state-of-the-art architectures yet struggle to achieve this goal. In particular, they report an elevated amount of high frequencies in the spectral statistics which makes it straightforward to distinguish real and ge...
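
The "spectral statistics" referred to here are typically summarized by an azimuthally averaged power spectrum; an elevated tail of that curve is the high-frequency excess the abstract describes. A minimal NumPy sketch of such a spectrum (my own illustration, not the paper's evaluation code):

    import numpy as np

    def radial_power_spectrum(img):
        """Azimuthally averaged power spectrum of a grayscale image."""
        power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
        h, w = img.shape
        y, x = np.indices((h, w))
        # Integer distance of each pixel from the spectrum's center.
        r = np.hypot(y - h // 2, x - w // 2).astype(int)
        sums = np.bincount(r.ravel(), weights=power.ravel())
        counts = np.bincount(r.ravel())
        return sums / np.maximum(counts, 1)  # mean power per frequency bin

    # Usage: compare the high-frequency tails of two grayscale images,
    # e.g. a real photo vs. a GAN sample (random arrays as stand-ins).
    real, fake = np.random.rand(128, 128), np.random.rand(128, 128)
    print(radial_power_spectrum(real)[-16:].mean(),
          radial_power_spectrum(fake)[-16:].mean())
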
MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images
672 views · 2 years ago
In this paper, we aim to create generalizable and controllable neural signed distance fields (SDFs) that represent clothed humans from monocular depth observations. Recent advances in deep learning, especially neural implicit representations, have enabled human shape reconstruction and controllable avatar generation from different sensor inputs. However, to generate realistic cloth deformations...
ATISS: Autoregressive Transformers for Indoor Scene Synthesis
3.5K views · 2 years ago
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
531 views · 2 years ago
Shape As Points: A Differentiable Poisson Solver
4.9K views · 2 years ago
CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
310 views · 2 years ago
Driving with Attention
1.1K views · 2 years ago
Towards Animatable Human Avatars
980 views · 2 years ago
STEP: Segmenting and Tracking Every Pixel
1.9K views · 2 years ago
Generative Neural Scene Representations for 3D-Aware Image Synthesis
2.9K views · 2 years ago
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
4K views · 2 years ago
UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
3.6K views · 2 years ago
SLIM: Self-Supervised LiDAR Scene Flow and Motion Segmentation
2K views · 2 years ago
SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes
728 views · 2 years ago
NEAT: Neural Attention Fields for End-to-End Autonomous Driving
1.4K views · 2 years ago
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
6K views · 2 years ago
SLIM: Self-Supervised LiDAR Scene Flow and Motion Segmentation
1.2K views · 2 years ago

COMMENTS

  • @tienquangpham9316
    @tienquangpham9316 1 month ago

    As a self-taught machine learning enthusiast who still struggles with ML accuracy in waste classification projects, I find Mr. Quoc Le's work on the "noisy student" technique to be very helpful. Thank you!

  • @leslietetteh7292
    @leslietetteh7292 2 months ago

    Hey Andreas. I'm playing around with diffusion models to see whether real-time image-to-image inference from polygon renderings to photorealistic scenes is possible. Since you specialise in automating the reverse, photorealism to 3D polygon renderings, I wonder if it would be possible for you to create a dataset that might help with this process?

  • @quake3video
    @quake3video 2 months ago

    Saves me doing this, now that I've discovered someone else already did it lol.

  • @Kamran_ai
    @Kamran_ai 3 months ago

    Great work, but how do I visualize the resulting GIF for an input image and its generated mesh file?

  • @blakeedwards3582
    @blakeedwards3582 4 months ago

    Thank you!

  • @akshatdobhal5653
    @akshatdobhal5653 6 months ago

    Great Conference!

  • @kristoferkrus
    @kristoferkrus 9 months ago

    Nice video and great results! However, is "convolutional" part of the name because the method uses convolutions somehow, or simply because it uses a grid (or several grids), just like a CNN?

  • @graham8316
    @graham8316 10 months ago

    What do the colors mean in optical flow? Is it a vector to color mapping?
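
    For reference, standard optical flow visualizations do map vectors to colors: direction becomes hue and magnitude becomes brightness. A minimal NumPy sketch of that common HSV convention follows (the exact color wheel used in the video is an assumption):

        import numpy as np
        from matplotlib.colors import hsv_to_rgb

        def flow_to_color(u, v):
            """Map a 2-D flow field to RGB: direction -> hue, magnitude -> value."""
            hue = (np.arctan2(v, u) / (2 * np.pi)) % 1.0   # angle mapped to [0, 1)
            val = np.hypot(u, v)
            val = val / (val.max() + 1e-8)                 # normalized magnitude
            hsv = np.stack([hue, np.ones_like(hue), val], axis=-1)
            return hsv_to_rgb(hsv)                         # (H, W, 3) RGB image

        # Usage: a synthetic rotational field; hue rotates around the center.
        y, x = np.mgrid[-1:1:64j, -1:1:64j]
        rgb = flow_to_color(-y, x)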

  • @VinodKumar-rj3kt
    @VinodKumar-rj3kt 10 months ago

    Cool approach.

  • @realolivertwisted
    @realolivertwisted 11 months ago

    I watched a lot of his lectures before he was banned. Any chance you know where he ended up? Haven’t checked alt platforms yet but I’m going to. (Found this video by Binging “Michael Black ideological subversion” and scrolling WAAAY down in the videos.) He didn’t (mysteriously/suspiciously) die like Yuri B., did he? 😬🙏🏼

  • @seantan4702
    @seantan4702 11 months ago

    impressive

  • @yanzhanchen9210
    @yanzhanchen9210 1 year ago

    It is so cool!

  • @LilGnomeGames
    @LilGnomeGames 1 year ago

    I would love to see where this project ended up or where it is today.

  • @chi-yaohuang3257
    @chi-yaohuang3257 1 year ago

    Great work!

  • @muhammadichsan914
    @muhammadichsan914 1 year ago

    I have my own point cloud file (PLY format); how do I run inference on it to get a mesh? The demo uses pointcloud.npz rather than PLY.

  • @warzeo8869
    @warzeo8869 1 year ago

    Bro, where did you buy it, bro?

  • @vornamenachname906
    @vornamenachname906 1 year ago

    yeah, it's called overfitting ...

  • @BlakeEdwards333
    @BlakeEdwards333 1 year ago

    What labeling tool did you use for the 3D object tracklets?

  • @rohitdhankar360
    @rohitdhankar360 1 year ago

    16:40 - Avoid THRESHOLDING.

  • @zahidulislam7189
    @zahidulislam7189 1 year ago

    Thank you. Inspiring works.

  • @zahidulislam7189
    @zahidulislam7189 1 year ago

    Super work. Learned many things. Thanks for the presentation.

  • @nhonth2011
    @nhonth2011 1 year ago

    Great video! Thank you.

  • @yongpengchang7238
    @yongpengchang7238 1 year ago

    Hi Andreas, I'm interested in the RVC 2022 talks and I wonder if you will upload the RVC 2022 videos here, thank you!

  • @joederksen7441
    @joederksen7441 1 year ago

    Hi Andreas, thanks for the video. I was wondering, in the TensoRF model, can you easily compute the gradient of the density function at any point (i.e., the surface normal vector)? With a NeRF I can compute the gradient of sigma using autodiff on the neural network. How does this translate to TensoRF?

    • @cvlibs
      @cvlibs 1 year ago

      This should also be possible; autodiff then basically differentiates through the (bi-/trilinear) interpolation function.
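
      A minimal PyTorch sketch of that reply (my own illustration, not TensoRF's code): differentiate a trilinearly interpolated density grid with respect to the query point to get a normal direction.

          import torch
          import torch.nn.functional as F

          # Toy stand-in for a decoded TensoRF density grid: (N, C, D, H, W).
          grid = torch.rand(1, 1, 8, 8, 8)

          def density(points):
              """Trilinearly interpolate the grid at points in [-1, 1]^3."""
              coords = points.view(1, -1, 1, 1, 3)        # grid_sample layout
              sigma = F.grid_sample(grid, coords, mode='bilinear',
                                    align_corners=True)   # trilinear for 5-D input
              return sigma.view(-1)

          pts = torch.tensor([[0.1, -0.2, 0.3]], requires_grad=True)
          (grad,) = torch.autograd.grad(density(pts).sum(), pts)
          # One common convention: the outward normal points along the
          # negative density gradient.
          normal = -grad / grad.norm(dim=-1, keepdim=True)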

  • @thetomer9786
    @thetomer9786 1 year ago

    What algorithm did you use for that?

  • @jakobpcoder
    @jakobpcoder 1 year ago

    Will there be new lectures this semester in computer vision, machine learning, etc.? I've already watched all of last year's, voluntarily; I'm from Hamburg.

  • @jakobpcoder
    @jakobpcoder 1 year ago

    Great! Very cool that the code is available, even though someone from Mercedes was involved :D

  • @ahmetfurkanaknc8959
    @ahmetfurkanaknc8959 1 year ago

    Thank you, excellent explanation.

  • @ramakrishnamutalikdesai3491

    can u share the code plz

  • @sanje1285
    @sanje1285 1 year ago

    Thank you for sharing your great presentation. I have a question. At 25:51, on the right side of the figure, I cannot understand the meaning of the black & white dots on the vector flow and semantic BEV. I think the black dots are points (x, y) placed at pseudo-random positions on the BEV, used to generate the vector from each position to the target waypoint. Could you clarify this?

  • @LiChengqi
    @LiChengqi 1 year ago

    👍

  • @bernhard_jaeger
    @bernhard_jaeger 1 year ago

    Nice video presentation.

  • @8eck
    @8eck 1 year ago

    Just wow! This is awesome! Thank you for your hard work. This is helping people like me to self-study and improve our skills in computer vision.

  • @4dvovo
    @4dvovo 1 year ago

    Thank you, Johannes Schönberger, for this fantastic open-source SfM pipeline software!

  • @hyeon-jinlee8716
    @hyeon-jinlee8716 2 years ago

    This presentation is amazing! It is very helpful for understanding everything from previous work (NeRF) to recent work (GIRAFFE), including current work (GRAF). And awesome work. Thank you very much!

  • @zosters9407
    @zosters9407 2 years ago

    Can you please share the code for semantic segmentation on the KITTI-360 dataset?

  • @huytruong31127
    @huytruong31127 2 years ago

    How do I make a 3D picture from a 2D picture? Can you make a video on this?

  • @rabailrana9916
    @rabailrana9916 2 years ago

    Please let me know where the complete dataset is.

  • @nirbhay_raghav
    @nirbhay_raghav 2 years ago

    I can't believe this was 11 years ago!! Imagine the tech the CIA has to do all of this. Drive a car around and create such detailed maps.

  • @MarcusVinicius-lq3fe
    @MarcusVinicius-lq3fe 2 years ago

    great!

  • @nirbhay_raghav
    @nirbhay_raghav 2 years ago

    How do they compare with implicit representations using sinusoidal activations?

  • @user-oj4hr5rh6i
    @user-oj4hr5rh6i 2 years ago

    Good idea to use some active contour concepts.

  • @lesleyHsieh
    @lesleyHsieh 2 years ago

    Fantastic research! The presentation is very clear. Thank you very much for sharing!

  • @blakeedwards3582
    @blakeedwards3582 2 years ago

    thank you

  • @user-im8gv6eh2y
    @user-im8gv6eh2y 2 years ago

    fantastic

  • @vertex.shader
    @vertex.shader 2 years ago

    Are you just ray marching at 7:23?

  • @LL-rn8rn
    @LL-rn8rn 2 years ago

    How does it compare to detectron2's segmentation?

  • @aeaxao9973
    @aeaxao9973 2 years ago

    Great job!

  • @MeKaashu
    @MeKaashu 2 years ago

    1:05 Ironic how the Toyota Research team used a Volkswagen as a test vehicle.

  • @edilgin622
    @edilgin622 2 years ago

    wow really amazing work!