Vladlen Koltun
Vladlen Koltun is an Israeli-American computer scientist and intelligent systems researcher specializing in robotics, autonomous driving, computer vision, computer graphics and machine learning.[7][8][2][9] His research and publications span convolutional neural networks, reality simulation, view synthesis,[9] photorealistic rendering, simulation for urban self-driving cars, 3D computer graphics, robot locomotion[2] and drone maneuverability in dynamic environments.[10][1][11][12][13][14] He currently serves as a distinguished scientist at Apple Inc.[1]
| Vladlen Koltun | |
|---|---|
| Born | 1980 |
| Nationality | Israeli-American |
| Alma mater | Tel Aviv University |
| Known for | CARLA, OpenBot |
| Awards | Sloan Research Fellowship, NSF CAREER Award |
| Fields | Computer vision, robotics, machine learning, computer graphics |
| Institutions | Stanford University, Adobe, Intel, Apple |
| Thesis | Arrangements in four dimensions and related structures (2002) |
| Doctoral advisor | Micha Sharir |
| Other academic advisors | Christos Papadimitriou |
| Website | vladlen |
Early life and education
Vladlen Koltun was born on November 21, 1980 in Kiev, Ukraine[15][16] and grew up in Israel, where he completed his B.S. in computer science magna cum laude at Tel Aviv University in 2000. Continuing at Tel Aviv University, he finished his Ph.D. in computer science with distinction in 2002, at the age of 21,[16] with the thesis "Arrangements in four dimensions and related structures"; his doctoral adviser was Micha Sharir.[17] He then held a postdoctoral fellowship under the supervision of Christos Papadimitriou at the University of California, Berkeley, where he conducted research in theoretical computer science from 2002 to 2005.[13]
Career
Koltun served as an assistant professor at Stanford University from 2005 to 2013, where he lectured on computer science, computer graphics and geometric algorithms.[7][18] His research interests at Stanford included computer vision, neural networks, data-driven 3D modeling and machine learning (dense random fields), among others.[1] During his time at Stanford, he also supervised Ph.D. students and postdoctoral researchers;[17] two of his students now serve as professors at UC Berkeley (Sergey Levine)[19] and UT Austin (Philipp Krähenbühl).[20] While at Stanford, Koltun received a Sloan Research Fellowship[3] and the National Science Foundation's CAREER Award.[4]
Koltun's research at Stanford[7] contributed to the development of data-driven 3D modeling technology[21] in close collaboration with Siddhartha Chaudhuri.[22] Work by Chaudhuri, Koltun, Evangelos Kalogerakis and Leonidas Guibas resulted in a SIGGRAPH publication in 2011.[23] Mixamo subsequently licensed the technology from Stanford; Adobe Inc. later acquired Mixamo and developed it into Adobe Fuse CC, a 3D computer graphics application that enabled users to create 3D characters.[23][24][21][22] In 2014, Adobe hired Koltun to conduct research in visual computing, with a primary focus on three-dimensional reconstruction.
Koltun left Adobe to join Intel, where he held various R&D positions until 2021, including Principal Researcher,[25] Senior Principal Researcher and Chief Scientist for Intelligent Systems.[10][6] According to his profile on the Institute of Electrical and Electronics Engineers' website, "Koltun's lab conducted high-impact basic research on intelligent systems, with emphasis on computer vision, robotics, and machine learning."[10] Since August 2021, Koltun has served as a distinguished scientist at Apple Inc.[1]
Research
While at Intel, Koltun was instrumental in developing virtual-reality simulators for urban autonomous driving, robots and drones. In particular, he directed research on deep reinforcement learning that applied neural networks inside virtual simulators. In this approach, a neural network first learns to make decisions by trial and error while exploring a simulated physical environment; once trained, the system is transferred to a robot or drone operating in the real world. The method was applied, for instance, to ANYmal, a sophisticated quadrupedal robot that uses proprioceptive feedback for locomotion control.[12][26]
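The train-in-simulation, then deploy, workflow described above can be illustrated with a toy example. The sketch below is purely illustrative (none of these names or environments come from Koltun's actual systems): tabular Q-learning learns by trial and error in a stand-in "simulator" (a 1D corridor), after which the frozen greedy policy is "deployed" without further exploration.

```python
import random

# Toy stand-in for a physics simulator: a 1D corridor of N cells;
# the agent starts at cell 0 and must reach cell N-1 (the "goal").
N_STATES = 8
ACTIONS = [-1, +1]  # step left / step right

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else -0.01  # small cost per step
    return nxt, reward, nxt == N_STATES - 1

# Phase 1: learn by trial and error inside the "simulator" (Q-learning).
random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.95, 0.2
for episode in range(300):
    s = 0
    for _ in range(500):  # cap episode length
        if random.random() < epsilon:
            a = random.randrange(2)            # explore
        else:
            a = max((0, 1), key=lambda i: Q[s][i])  # exploit
        s2, r, done = step(s, ACTIONS[a])
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break

# Phase 2: "deploy" the frozen greedy policy (no more exploration).
policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES)]

s, done, steps = 0, False, 0
while not done and steps < 100:
    s, _, done = step(s, ACTIONS[policy[s]])
    steps += 1
print(steps)  # the trained policy walks straight to the goal in 7 steps
```

The real systems replace the table with a deep neural network and the corridor with a high-fidelity physics simulator, but the two-phase structure — exploratory learning in simulation, frozen policy on hardware — is the same.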
Another example was the development of intelligent systems for drones flying in dynamic environments, based on research papers co-published with University of Zurich robotics professor Davide Scaramuzza: Deep Drone Racing: Learning Agile Flight in Dynamic Environments (2018),[27] Deep Drone Racing: From Simulation to Reality With Domain Randomization (2019)[28] and Deep Drone Acrobatics (2020).[29] In this collaboration between Intel Labs and the University of Zurich, neural networks were trained in simulation and then deployed on physical quadrotors, in some cases showing maneuverability and speed comparable to drones flown manually by human pilots.[28][30][31]
The studies in the domain of urban autonomous driving led Koltun's group[32] to develop the CARLA project (short for Car Learning to Act), an open-source simulator,[33] powered by Unreal Engine,[34] that can be used to test self-driving technologies in realistic driving environments with random dangerous situations occurring on the roads.[35][36][37] The project has been funded by Intel Labs and the Toyota Research Institute, among others.[38][36]

During his time at Intel Labs, Koltun also contributed to the further development of 3D computer graphics in the fields of photorealistic view synthesis and photorealistic rendering.[39][9][40] In their work Enhancing Photorealism Enhancement,[41][42][43] Koltun, Stephan R. Richter and Hassan Abu AlHaija used a view synthesis technique drawing on the Cityscapes Dataset,[39][44] a collection of stereo video sequences captured by car-mounted cameras for machine-learning algorithms. As VentureBeat explains: "The full system, displayed below, is composed of several interconnected neural networks. The G-buffer encoder transforms different render maps (G-buffers) into a set of numerical features. G-buffers are maps for surface normal information, depth, albedo, glossiness, atmosphere, and object segmentation. The neural network uses convolution layers to process this information and output a vector of 128 features that improve the performance of the image enhancement network and avoid artifacts that other similar techniques produce. The G-buffers are obtained directly from the game engine. The image enhancement network takes as input the game's rendered frame and the features from the G-buffer encoder and generates the photorealistic version of the image. The remaining components, the discriminator and the LPIPS loss function, are used during training. They grade the output of the enhancement network by evaluating its consistency with the original game-rendered frame and by comparing its photorealistic quality with real images."[45]

As a result, the researchers achieved substantial gains in stability and realism over alternative image-to-image translation methods and a variety of other baselines.[46] Andrew Liszewski of Gizmodo notes: "Based on dataset, the neural network also uses other rendered data the game's engine has access to, like the depth of objects in a scene, and information about how the lighting is being processed and rendered."[47] The photorealism enhancement system was tested on the game Grand Theft Auto V and attracted the attention of technology experts.[48][49][50][39]
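The data flow that VentureBeat describes can be sketched schematically. The numpy stand-in below is only a shape-level illustration under stated assumptions (the channel counts, the 1×1-convolution stand-ins and the gating step are invented for illustration, not the paper's actual networks): a "G-buffer encoder" collapses per-pixel render maps into a 128-dimensional feature vector, which conditions an "enhancement network" mapping the rendered frame to an output of the same shape.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 32, 32

# G-buffers: per-pixel render maps the game engine already produces
# (surface normals, depth, albedo, glossiness, segmentation, ...),
# stacked here as channels of one array.
g_buffers = rng.standard_normal((H, W, 10))   # 10 illustrative channels
frame = rng.standard_normal((H, W, 3))        # the game's rendered RGB frame

def g_buffer_encoder(g):
    """Collapse the G-buffer maps into a 128-dimensional feature vector."""
    w = rng.standard_normal((g.shape[-1], 128)) * 0.1
    per_pixel = np.tanh(g @ w)            # 1x1-conv stand-in: (H, W, 128)
    return per_pixel.mean(axis=(0, 1))    # global pooling -> (128,)

def enhancement_network(frame, g_feats):
    """Map the rendered frame to an output frame, conditioned on G-buffer features."""
    gate = np.tanh(g_feats @ (rng.standard_normal((128, 3)) * 0.1))  # (3,)
    return np.tanh(frame * (1.0 + gate))  # same spatial shape as the input

feats = g_buffer_encoder(g_buffers)
enhanced = enhancement_network(frame, feats)
print(feats.shape, enhanced.shape)  # (128,) (32, 32, 3)
```

In the actual system both modules are trained convolutional networks, and the discriminator and LPIPS loss mentioned in the quote supply the training signal; this sketch only traces how the tensors connect.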
Inspired by Google Cardboard,[2] Koltun and Matthias Müller, a German scientist specializing in aerial tracking and sim-to-real transfer for autonomous navigation, developed OpenBot, a software stack that turns Android smartphones into autonomous four-wheeled robots capable of navigating in space, following objects of interest and avoiding obstacles. The robot's hardware includes a 3D-printable chassis with notches for a controller, microcontroller, LEDs, a smartphone mount, and a USB cable.[51] An Arduino Nano board bridges the smartphone with the motor actuation and batteries, while an Android app handles data integration.[2][52] The project was released as open source[53][54] for various robotics applications, with the software development kit available on GitHub.[55][56]
Koltun has been critical of the reliability of the h-index, drawing attention to the inflation of h-index values as large-scale co-authorship has become common in scientific communities.[57] In collaborative work with David Hafner, The h-index is no longer an effective correlate of scientific reputation, Koltun and Hafner state that "hyper-authorship – a growing phenomenon where global research consortia produce papers with thousands of co-authors – enables people to rack up enormous h-indices very quickly."[8][58][59]
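The metric criticized in that paper is simple to state: a researcher has h-index h if h of their papers have been cited at least h times each. A minimal sketch of the computation:

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i      # the i most-cited papers all have >= i citations
        else:
            break
    return h

print(h_index([25, 8, 5, 3, 3]))  # 3
print(h_index([10, 8, 5, 4, 3]))  # 4
print(h_index([]))                # 0
```

The inflation Koltun and Hafner describe follows directly from this definition: every one of a paper's co-authors, whether five or five thousand, counts the same highly cited paper toward their own h-index in full.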
Selected works
- Stephan R. Richter, Hassan Abu AlHaija, Vladlen Koltun; Enhancing Photorealism Enhancement, arXiv:2105.04619 (2021)
- Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, Marco Hutter; Learning Quadrupedal Locomotion over Challenging Terrain, Science Robotics (2020)[12]
- Elia Kaufmann, Antonio Loquercio, René Ranftl, Matthias Müller, Vladlen Koltun, Davide Scaramuzza; Deep Drone Acrobatics, Robotics: Science and Systems (2020)[60]
- Manolis Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra; Habitat: A Platform for Embodied AI Research, International Conference on Computer Vision (ICCV) (2019)[61]
- Chen Chen, Qifeng Chen, Jia Xu, Vladlen Koltun; Learning to See in the Dark, Computer Vision and Pattern Recognition (CVPR) (2018)
- Shaojie Bai, J. Zico Kolter, Vladlen Koltun; An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling, arXiv:1803.01271 (2018)
- Qian-Yi Zhou, Jaesik Park, Vladlen Koltun; Open3D: A Modern Library for 3D Data Processing, arXiv:1801.09847 (2018)
- Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio López, Vladlen Koltun; CARLA: An Open Urban Driving Simulator, Conference on Robot Learning (CoRL) (2017)
- Fisher Yu, Vladlen Koltun; Multi-Scale Context Aggregation by Dilated Convolutions, International Conference on Learning Representations (ICLR) (2016)
- Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun; Playing for Data: Ground Truth from Computer Games, European Conference on Computer Vision (ECCV) (2016)
- Sergey Levine, Vladlen Koltun; Guided Policy Search, International Conference on Machine Learning (ICML) (2013)
- Philipp Krähenbühl, Vladlen Koltun; Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, Advances in Neural Information Processing Systems (NIPS) (2011)
References
- "Vladlen Koltun, Distinguished Scientist at Apple Inc". Google Scholar.
- "How To Turn Your Smartphone Into A Robot". Discover Magazine.
- "Sloan Research Fellowship in computer science (2007)" (PDF). Alfred P. Sloan Foundation.
- "Career Award: Fundamental Geometric Algorithms".
- "Computer Graphics as a Telecommunication Medium". Princeton University.
- "Computer Science Bibliography: Vladlen Koltun's affiliations". Schloss Dagstuhl, Leibniz Center for Informatics.
- Stober, Dan (2008-01-07). "Building a better virtual world, one tree (or millions) at a time". Stanford News.
- Durrani, Jamie (2021-07-29). "Reliability of researcher metric the h-index is in decline". Chemistry World.
- "This new tech from Intel Labs could revolutionize VR gaming". PC Games.
- "Vladlen Koltun". Institute of Electrical and Electronics Engineers.
- "Robots, hominins and superconductors: 10 remarkable papers from 2019". Nature. 576 (7787): 394–396. 2019. Bibcode:2019Natur.576..394.. doi:10.1038/d41586-019-03834-4. PMID 31844266. S2CID 209371845.
- Lee, Joonho; Hwangbo, Jemin; Wellhausen, Lorenz; Koltun, Vladlen; Hutter, Marco (2020). "Learning quadrupedal locomotion over challenging terrain". Science Robotics. 5 (47). arXiv:2010.11251. doi:10.1126/scirobotics.abc5986. hdl:20.500.11850/448343. PMID 33087482. S2CID 224828219.
- "Stanford Report, February 1, 2006". Stanford News. February 2006.
- "Vladlen Koltun: Publications & Citations Over Time". Microsoft Academic. Archived from the original on December 2, 2021.
- "Vladlen Koltun Bio" (PDF). Curriculum Vitae (confirming place and date of birth).
- "Arrangements in four dimensions and related structures". The National Library of Israel.
- "Vladlen Koltun". Math Genealogy.
- "Courses at Stanford". Official website.
- "Sergey Levine". Berkeley Research.
- "Philipp Krähenbühl, Assistant Professor". The University of Texas at Austin.
- Dean Takahashi (November 7, 2013). "Mixamo debuts Fuse character creation tool on Steam using Valve's Team Fortress 2 characters". Venture Beat.
- "3D modeling with data-driven suggestions". Stanford Digital Repository.
- "Probabilistic Reasoning for Assembly-Based 3D Modeling". Cornell University.
- "Adobe Fuse CC Alternatives, Similar". AlternativeBK.
- "About Vladlen Koltun".
- Lee, Joonho; Hwangbo, Jemin; Wellhausen, Lorenz; Koltun, Vladlen; Hutter, Marco (October 2020). "Learning Quadrupedal Locomotion over Challenging Terrain". Science Robotics. 5 (47). arXiv:2010.11251. doi:10.1126/scirobotics.abc5986. PMID 33087482. S2CID 224828219.
- Kaufmann, Elia; Loquercio, Antonio; Ranftl, Rene; Dosovitskiy, Alexey; Koltun, Vladlen; Scaramuzza, Davide (October 2018). "Deep Drone Racing: Learning Agile Flight in Dynamic Environments". Conference on Robot Learning (CoRL). arXiv:1806.08548.
- Loquercio, Antonio; Kaufmann, Elia; Ranftl, Rene; Dosovitskiy, Alexey; Koltun, Vladlen; Scaramuzza, Davide (October 2019). "Deep Drone Racing: From Simulation to Reality With Domain Randomization". IEEE Transactions on Robotics. 36 (1): 1–14. arXiv:1905.09727. doi:10.1109/TRO.2019.2942989. S2CID 162183971.
- Kaufmann, Elia; Loquercio, Antonio; Ranftl, René; Müller, Matthias; Koltun, Vladlen; Scaramuzza, Davide (July 2020). "Deep Drone Acrobatics". Robotics: Science and Systems. arXiv:2006.05768. doi:10.15607/RSS.2020.XVI.040. ISBN 978-0-9923747-6-1. S2CID 219559096.
- Ackerman, Evan (2020-10-07). "AI-Powered Drone Learns Extreme Acrobatics". IEEE Spectrum.
- "Drones learn acrobatics by themselves". RoboHub.
- "Self-Driving Cars in Simulated Worlds". Udacity. 11 September 2018.
- "Carla Simulator". GitHub. 23 December 2021.
- "Carla: Open-source simulator for autonomous driving research". MarkTechPost. 29 February 2020.
- "The Open-Source Driving Simulator That Trains Autonomous Vehicles". MIT Technology Review.
- "Carla Project". Official Website.
- Dosovitskiy, Alexey; Ros, German; Codevilla, Felipe; Lopez, Antonio; Koltun, Vladlen (November 2017). "CARLA: An Open Urban Driving Simulator". Conference on Robot Learning (CoRL). arXiv:1711.03938.
- "Toyota donates $100,000 for open-source self-driving simulator". CNET.
- "Achieving Photorealism With Neural Networks". Towards Data Science. 31 May 2021.
- "Free View Synthesis" (PDF). Intel Labs.
- "Machine Learning Takes GTA V Photorealism to Never-Before-Seen Levels". Interesting Engineering. 17 May 2021.
- "Enhancing Photorealism Enhancement". YouTube.
- Richter, Stephan R.; Hassan Abu AlHaija; Koltun, Vladlen (May 2021). "Enhancing Photorealism Enhancement". arXiv:2105.04619 [cs.CV].
- "The Cityscapes Dataset". The Cityscapes Dataset.
- Dickson, Ben (2021-05-31). "Intel's image-enhancing AI is a step forward for photorealistic game engines". VentureBeat.
- "Guide to Intel's Stable View Synthesis – A State-of-Art 3D Photorealistic Framework". Analytics India Magazine. 5 March 2021.
- Liszewski, Andrew (2021-05-12). "Grand Theft Auto Looks Frighteningly Photorealistic With This Machine Learning Technique". Gizmodo.
- "Grand Theft Auto V' mod adds uncanny photorealism through AI". Engadget.
- "Grand Theft Auto Gets A CNN Facelift". Analytics India Magazine. 20 May 2021.
- Carlos Campbell, Ian (2021-05-12). "Intel is using machine learning to make GTA V look incredibly realistic". The Verge.
- "Intel researchers design smartphone-powered robot that costs $50 to assemble". VentureBeat. 26 August 2020.
- Sakharkar, Ashwini (2020-08-29). "Engineers develop an open-source, smartphone-powered robot". Inceptive Mind.
- "Turning smartphones into robots". OpenBot Project.
- Müller, Matthias; Koltun, Vladlen (August 2020). "OpenBot: Turning Smartphones into Robots". arXiv:2008.10631v2 [cs.RO].
- "Intel Lab Transforms Your Phone into a Robot for $50". Synced Review. 27 August 2020.
- "OpenBot code". GitHub. 23 December 2021.
- "Evaluating researchers". Talk at the CMU AI Seminar.
- Koltun, Vladlen; Hafner, David (June 2021). "The h-index is no longer an effective correlate of scientific reputation". PLOS ONE. 16 (6): e0253397. arXiv:2102.03234. Bibcode:2021PLoSO..1653397K. doi:10.1371/journal.pone.0253397. PMC 8238192. PMID 34181681.
- Hafner, David (June 2021). "Data for "The h-index is no longer an effective correlate of scientific reputation"". Mendeley Data. 1. doi:10.17632/wsrjd8m2h6.1.
- "Paper Awards, 2020". Robotics Science and Systems.
- Savva, Manolis; Kadian, Abhishek; Maksymets, Oleksandr; Zhao, Yili; Wijmans, Erik; Jain, Bhavana; Straub, Julian; Liu, Jia; Koltun, Vladlen; Malik, Jitendra; Parikh, Devi; Batra, Dhruv (February 2020). "Habitat: A Platform for Embodied AI Research". 2019 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 9338–9346. arXiv:1904.01201. doi:10.1109/ICCV.2019.00943. ISBN 978-1-7281-4803-8. S2CID 91184540.