A 3D city model, which consists of 3D building models together with their geospatial positions and orientations, is becoming a valuable resource in virtual reality, navigation systems, civil engineering, and other fields. The purpose of this research is to propose a new framework for generating 3D city models that satisfy the visual and physical requirements of ground-oriented simulation systems. At the same time, the framework should meet the demands of automatic creation and cost-effectiveness, which makes the proposed approach more practical to use. To that end, I propose a framework that leverages a mobile mapping system, which automatically gathers high-resolution images along with supplementary sensor information such as the position and direction of each image. To resolve the problems caused by sensor noise and the large number of occlusions, digital map data will be fused with the collected data. This paper describes the overall framework, its major processes, and the techniques recommended or required for each processing step.