YANG Yanbo, LIU Bin, QI Mingyue. Review of information visualization[J]. Journal of Hebei University of Science and Technology, 2014, 35(1): 91-102
Review of information visualization
Received: October 16, 2013  Revised: November 20, 2013
DOI:10.7535/hbkd.2014yx01016
Keywords: information visualization; visualization technology; human-machine interaction; data mining
Funding: National Natural Science Foundation of China (71271076)
Author Name    Affiliation
YANG Yanbo     School of Economics and Management, Hebei University of Science and Technology
LIU Bin        School of Economics and Management, Hebei University of Science and Technology
QI Mingyue     Communication Station of Hebei Provincial Military Command
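Chinese abstract (English translation):
      Information visualization is the application of visualization technology to non-spatial data. It enhances the presentation of data and lets users observe and browse data in an intuitive, interactive way, so that they can discover features, relationships, and patterns hidden in the data. Visualization is applied very widely, mainly in data mining visualization, network data visualization, social visualization, traffic visualization, text visualization, and biomedical visualization. According to Card's visualization model, the information visualization process can be divided into the following stages: data preprocessing; rendering; display and interaction. According to Shneiderman's classification, the data of information visualization fall into the following types: one-dimensional data, two-dimensional data, three-dimensional data, multidimensional data, temporal data, hierarchical data, and network data. Visualization of the last four types is the focus of current research.
      Visualization methods for multidimensional data mainly include geometry-based methods, icon-based methods, and animation methods. The most classic geometry-based method is parallel coordinates. Parallel coordinates use parallel vertical axes to represent the dimensions; the values of the multidimensional data are marked on the axes, and the coordinate points of one data item on all axes are connected by a polyline. The method displays multidimensional data concisely and quickly, and many improved techniques have been derived from it. When the data set becomes very large, however, the dense polylines cause visual clutter; remedies include dimension reordering, interaction, clustering, filtering, and animation. Other geometry-based methods include Radviz, which displays the result in a circular coordinate system, and the scatter plot matrix, which pairs the dimensions of the data two by two and draws the pairs as a regularly arranged series of scatter plots. Icon-based methods describe data through the visual features of geometric shapes, such as size, length, shape, and color; representative methods include star plots and Chernoff faces. Animation can improve interactivity and understanding in visualization, but its drawbacks include possible distraction, user misunderstanding, and chartjunk.
      Time series data are data sets with a time attribute. Their visualization methods are line charts, stacked graphs, animation, horizon graphs, and timelines. Hierarchical data have rank or level relationships. Their visualization methods mainly include two kinds: node-link diagrams and treemaps. A treemap displays hierarchical data as a series of nested rings and blocks. To display more node content, interaction methods based on the "focus+context" technique have been developed, including fisheye views, geometric distortion, semantic zooming, and clustering of nodes far from the focus. Network data have a network structure. Automatic layout algorithms are the core of network data visualization, and there are currently three main classes: force-directed layout, hierarchical layout, and grid layout. When the nodes have many connections, edges tend to cross, which causes visual clutter. Edge bundling techniques, which address edge crossings, can be divided into force-directed edge bundling, hierarchical edge bundling, geometry-based edge clustering, multilevel agglomerative edge bundling, and grid-based methods.
      Other research hotspots include the visual factors of graphics, adaptive visualization, and the evaluation of visualization results. The influence of visual factors such as position, length, area, shape, and color on the visualization result has attracted the attention of many researchers. Color is an important visual factor, and research concentrates on the principles of color selection and on interactive systems. These principles are based on data type, number of classes, cognitive constraints, and so on. Adaptive visualization can improve the adaptability of information visualization. Research results fall into three categories: adaptive visual display, the adaptive resource model, and the adaptive user model. Adaptive visual display means automatically offering users multiple display types according to their characteristics, automatically selecting the visualized content and its layout, and automatically adjusting the visual elements. The adaptive resource model reflects the use of hardware and software to improve visualization performance. The adaptive user model displays the contents of the user model and lets users edit them, so that users can control the model. Research on the evaluation of information visualization is currently scarce, and the few existing studies do not propose a direct and general evaluation method; in-depth research is needed on the theoretical basis, methods, and applications of information visualization evaluation.
      Visualization technology and applications should continue to advance in four respects: intuitiveness, relatedness, artistry, and interactivity. The development directions of information visualization technology are collaboration, analytics, computation, and sense-making. Future research may include the following. Close integration of information visualization and data mining: to improve speed and efficiency when processing massive data and to resolve visual clutter, the formulas and algorithms of data mining must be used, and the process and results of data analysis must be presented visually. Collaborative visualization: research directions may include visualization interface design, development of web-based collaborative visualization platforms, view design for collaborative visualization work, workflow management in collaborative visualization, and applications of collaborative visualization technology. Application technology for more fields, including statistics visualization, which requires research on techniques that use geometry, animation, images, and similar tools to process the procedure and results of data statistics; news visualization, which crawls, cleans, extracts, and visually presents news content; social network visualization, which displays social network data visually and integrates nodes, relationships, and spatiotemporal data; and search log visualization, which visually presents users' search behavior, relationships, and patterns from the massive logs produced when search engines are used.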
English abstract:
      Information visualization is the application of visualization technology to non-spatial data, which enhances the presentation of data. Users can observe data intuitively and interactively so as to find implicit features, relations, and patterns in it. Information visualization is applied very broadly, including data mining visualization, network data visualization, social data visualization, traffic visualization, text visualization, and biomedical visualization. According to Card's model of information visualization, the process comprises three stages: data preprocessing, rendering, and display and interaction. Ben Shneiderman classifies visualization data into one-dimensional, two-dimensional, three-dimensional, multidimensional, temporal, hierarchical, and network data, of which the last four receive the most research attention. Visualization methods for multidimensional data include geometry-based methods, icon-based methods, and animation methods. Among the geometry-based methods, the most classic is the parallel coordinates approach. It uses parallel vertical axes to represent the dimensions. The values of each data item are marked on the axes, and the marked points of one item are connected by a polyline across all axes, which presents the multidimensional data. Parallel coordinates display multidimensional data concisely and quickly, and many improved techniques have been derived from them. When the data set is very large, however, the dense lines cause visual clutter. Methods of clutter reduction include dimension reordering, interaction, clustering, filtering, and visual enhancement. Other geometry-based methods include Radviz (radial coordinate visualization), which displays multidimensional data in a circular coordinate system, and the scatter plot matrix, which pairs the dimensions of the data two by two and draws the pairs as a regularly arranged series of scatter plots.
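The parallel coordinates mapping described above can be sketched in a few lines. This is only an illustrative sketch, not code from the reviewed article: the function name `parallel_coordinates_polylines`, the min-max scaling of each axis to [0, 1], and the choice to place constant dimensions at mid-axis are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def parallel_coordinates_polylines(
    rows: Sequence[Sequence[float]],
) -> List[List[Tuple[float, float]]]:
    """Map each data item to a polyline across evenly spaced vertical axes.

    Axis d sits at x = d; the y position is the item's value on that
    dimension, min-max scaled to [0, 1] (an assumption of this sketch).
    """
    if not rows:
        return []
    n_dims = len(rows[0])
    # Per-dimension minimum and maximum, used to scale each axis.
    lows = [min(r[d] for r in rows) for d in range(n_dims)]
    highs = [max(r[d] for r in rows) for d in range(n_dims)]

    polylines = []
    for r in rows:
        line = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # A constant dimension has no spread; place it mid-axis.
            y = 0.5 if span == 0 else (r[d] - lows[d]) / span
            line.append((float(d), y))
        polylines.append(line)
    return polylines


# Three items with three dimensions each; the third dimension is constant.
data = [[1.0, 10.0, 5.0], [2.0, 30.0, 5.0], [3.0, 20.0, 5.0]]
lines = parallel_coordinates_polylines(data)
```

Each inner list holds the vertices of one polyline, so drawing it with any plotting tool only requires connecting consecutive points. With many items the polylines overlap heavily, which is the visual clutter the abstract refers to.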
Icon-based methods describe multidimensional data through the visual features of geometric shapes, such as size, length, form, and color. They include star plots and Chernoff faces. Animation can improve interactivity and understanding in visualization, but its shortcomings include distraction, misunderstanding, and chartjunk. Time series data are data sets with a time attribute. Their visualization methods include line charts, stacked graphs, animation, horizon graphs, and timelines. Hierarchical data have rank or level relationships. Their visualization methods mainly include node-link diagrams and treemaps. A treemap displays hierarchical data as a series of nested rings and blocks. To display more node content, interaction methods based on the "focus+context" technique have been developed, including fisheye views, geometric distortion, semantic zooming, and clustering of nodes far from the focus. Network data have a network structure. Automatic layout algorithms are the core of network data visualization and fall into three classes: force-directed layout, hierarchical layout, and grid layout. When the nodes have many connections, edges cross and cause visual clutter. Edge bundling techniques address edge crossings and include hierarchical edge bundling, force-directed edge bundling, geometry-based edge clustering, multilevel agglomerative edge bundling, and grid-based methods. Other research hotspots include visual features, adaptive visualization, and the evaluation of information visualization. The effect of visual features such as position, length, area, shape, and color on the visualization result has received considerable attention. Color is one of the most important visual factors; research focuses on principles of color selection and on interactive systems, with the principles based on data type, number of classes, and cognitive constraints.
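As one concrete reading of the treemap idea mentioned above, the following sketch nests rectangular blocks whose areas are proportional to the sizes of the leaves below them. It uses the slice-and-dice scheme (alternating the split direction at each level), which is a technique chosen for this example and is not named in the abstract; the names `total`, `slice_and_dice`, and the sample tree are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[float, Dict[str, "Tree"]]    # a leaf size, or named children
Rect = Tuple[float, float, float, float]  # x, y, width, height


def total(node: Tree) -> float:
    """Size of a leaf, or the summed size of all leaves below a node."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return float(node)


def slice_and_dice(
    node: Tree, x: float, y: float, w: float, h: float,
    depth: int = 0, path: str = "",
) -> List[Tuple[str, Rect]]:
    """Return (leaf path, rectangle) pairs nested inside (x, y, w, h)."""
    if not isinstance(node, dict):
        return [(path, (x, y, w, h))]
    cells: List[Tuple[str, Rect]] = []
    size = total(node)
    offset = 0.0  # fraction of the parent already handed to earlier children
    for name, child in node.items():
        frac = total(child) / size if size else 0.0
        child_path = f"{path}/{name}" if path else name
        if depth % 2 == 0:
            # Even depth: divide the parent's width among the children.
            cells += slice_and_dice(child, x + offset * w, y, frac * w, h,
                                    depth + 1, child_path)
        else:
            # Odd depth: divide the parent's height among the children.
            cells += slice_and_dice(child, x, y + offset * h, w, frac * h,
                                    depth + 1, child_path)
        offset += frac
    return cells


# Hypothetical hierarchy: leaf values are sizes in arbitrary units.
tree = {"docs": 2, "src": {"core": 3, "ui": 3}}
cells = dict(slice_and_dice(tree, 0.0, 0.0, 8.0, 4.0))
```

In this example the 8 by 4 canvas is first split by width between `docs` and `src`, and the `src` block is then split by height between `core` and `ui`, so each leaf's area stays proportional to its size.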
Adaptive visualization can enhance the adaptability of information visualization. According to the research of Domik & Gutkauf and Grawemeyer & Cox, it includes adaptive display, the adaptive resource model, and the adaptive user model. Adaptive display automatically provides suitable display types for different users, including automatically selecting content and layout and adjusting visual elements. The adaptive resource model concerns the use of hardware and software to improve visualization performance. The adaptive user model displays the contents of the user model and lets the user edit them, so that the user can control the model. Morse et al. note that research on the evaluation of information visualization is scarce, and the few existing studies do not propose a direct and general evaluation method. In-depth research is therefore needed on the theoretical basis, methods, and applications of information visualization evaluation. The technology and application of information visualization should develop in four directions: presenting data intuitively, mining and showing relations among data, strengthening aesthetics and artistry, and enhancing interaction with and operation on real-time data. Dai et al. noted that the development directions of information visualization are collaboration, analytics, computation, and sense-making. Future research directions are as follows. Visualization and data mining: to improve efficiency and avoid visual clutter when processing massive data, information visualization should be combined with data mining so that users can operate on massive data and discover implicit information. Collaborative visualization: research topics include interface design, web-based collaborative platforms, view design, workflow design, and applications of the technology. Applications in more fields: statistics visualization refers to processing the procedure and results of data statistics using geometry, animation, images, and similar tools.
News visualization refers to crawling, cleaning, and extracting news content and presenting the analysis results in diverse visual forms. Social network visualization refers to displaying and revealing relations, comparisons, and trends in social networks by integrating the dimensions of time and space. Search log visualization refers to visually presenting users' search behavior, relationships, and patterns from the massive logs produced when search engines are used.