Well, everybody, good evening.

Today I'm going to talk about big data and urban spatial analysis. You can see the name of our laboratory below: the Big Data and Urban Spatial Analysis lab. This lab studies how to apply big data to urban space, and today I will share some of our research results with you.

I think everyone has heard a lot about big data. On the surface it is nothing more than "big", but quantitative change produces qualitative change. When the data is big enough, a lot of your behavior and a lot of your thinking has to change. That qualitative change is the key to what we call big data. So where does the qualitative change lie?

Let's have a look. Big data is data too large to be captured, managed, and processed by current mainstream software tools within a reasonable time, and organized into information that helps enterprises make decisions. That means the methods we now know, the methods we generally train people in, simply cannot cope with big data; they cannot handle it at all. When you read in the data, the program crashes. What do you do then? Add more hardware?
Adding more software alone will not work either. Suppose you upgrade the hardware but keep using common software: it still cannot read the data. At that point you have to upgrade the software too, and learn new software. In the process of learning it, while you are working with it, you find that the traditional way of thinking also has to change. In the past we built models and paid attention to their precision; we pursued precision. But you find that big data does not seem to demand that kind of pursuit. So what we are talking about here is that big data has brought a change from quantity to quality, and our way of thinking must change with it.

So why did big data appear? Many scholars have offered explanations. Here I quote Qiu Zeqi, a sociologist, and his interpretation of big data. He said that big data is the trace data generated as social life moves online. What is trace data? Changes in everyday society, including our behavior and all kinds of events, can now be recorded, because of the popularity of the Internet.
Because of wireless Internet access, these events and these acts are recorded. Where are they recorded? In all kinds of clouds. So all these data exist, and they grow rapidly every day. We suddenly found there were so many data that we could not handle them. That is why it is called big data and not garbage: it has its essential meaning.

About big data's characteristics, I think this is a classic illustration. It is called the 3Vs, the earliest characterization of big data, which defines it by three Vs.

The first is that the volume is very large. We can see that its units keep changing. We used to talk about GB; now we talk about TB; soon we will be down at ZB. How do these units change? Let's have a look. This is the series of units for recording amounts of data. We are familiar with the earlier ones. These units rise by powers of two: 2 to the 10th, 2 to the 20th, 2 to the 30th, 2 to the 40th, and so on down the line. I believe that soon we will face the units further down the list; we will meet ZB and even larger.
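To make the unit ladder concrete, here is a minimal sketch (not from the lecture) of how each binary data-size unit is a factor of 2 to the 10th larger than the previous one:

```python
# Binary data-size units: each step up the ladder multiplies by 2**10 = 1024.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def unit_in_bytes(unit: str) -> int:
    """Return the size of one `unit` in bytes, using binary (2**10) steps."""
    exponent = 10 * (UNITS.index(unit) + 1)
    return 2 ** exponent

print(unit_in_bytes("GB"))  # 2**30 = 1073741824 bytes
print(unit_in_bytes("ZB"))  # 2**70 bytes
```

So GB, TB, and ZB are simply 2 to the 30th, 40th, and 70th power bytes respectively, which is why the speaker can "keep lining them up".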
This is the first V: volume. The second characteristic is that there are many kinds: variety. Big data is not just data in the narrow sense but information. That information includes numeric data; it includes text, the messages we leave on the Internet, all kinds of text; and it includes pictures and sounds. These are unstructured data. We distinguish structured from unstructured data, and in big data the unstructured part dominates: the data we can crawl from the network is mostly of this kind. So how do we deal with such data? Our traditional methods cannot, and in response new methods have emerged: text analysis, sentiment analysis, image analysis. This is the second V, the variety of data structures.

The third V is the shift from batch data to stream data: velocity. We say the data is very large, so it does not allow us to first collect all the data and then analyze it. In the past, the data was handed to us first.
It was put in front of us, and only then did we analyze it. Now, in the age of big data, the challenge is that the data streams past quickly, and you have to finish your analysis while it streams. You are analyzing in the middle of the reading process; you cannot wait until the reading is done and then go back to analyze. So the methods big data uses are simple, and then simpler still: the simpler the method, the better. This is quite different from the precise analysis we used to pursue in statistics.

On this basis, one company added a fourth V: value density. Big data's value density is very low. In the middle of a huge amount of data, what is useful to you may be just a couple of records. You have to find those records; they are yours, and finding them is the challenge. In the middle of this low value density, how do you find your data? This is a major challenge for big data analysis.
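The idea of finishing the analysis while the data streams past, using a deliberately simple method, can be sketched as follows (a hypothetical example, not from the lecture): a one-pass running mean that never stores the data it has seen.

```python
def running_mean(stream):
    """One-pass mean over a data stream: each item is seen once, then discarded."""
    count, mean = 0, 0.0
    for x in stream:
        count += 1
        mean += (x - mean) / count  # incremental update, O(1) memory
    return mean

# Works on a generator, i.e. data that flows past and cannot be revisited.
print(running_mean(iter([2.0, 4.0, 6.0])))  # 4.0
```

The update rule is as simple as arithmetic gets, yet it is exactly the kind of method that survives when the data is too large and too fast to collect first and analyze later.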
Well, let's go back to this picture. In the big data age our information, our behavior, and the events around us are all recorded. This is the ratio; you can see this ratio. It is the proportion of the events around us that information technology has digitized. By 2013 it had reached 98%. I don't know how this figure was counted; I am quoting someone else. In a word, it is a society in which hardly anything goes unrecorded, so the proportion of digitized information is practically total.

These days our lives, including walking and cycling, taking a taxi or a bus, can all be recorded in some way. When the recorded information is analyzed, it makes people's activities more convenient. This is a core idea of the smart city. There are many definitions of the smart city, but this one is almost universally accepted: big data thoroughly mines what happens in society, and the results of that mining feed back to guide the operation of society
and the conduct of citizens. This is the framework of the smart city. So when big data comes, it brings us convenience.

Faced with huge amounts of data, what do we do first? The first step is to visualize what the data is about. Why has visualization suddenly become so important? Because with big data, if you cannot visualize it, you cannot take the next step. When so much data is handed to you, where do you start? You have no idea. You first need a rough outline, and then you think about what to do inside it. Getting that outline means turning a huge amount of data into a few graphs; that is visualization.

This one shows movement across a city's mobile network: a trajectory map made from mobile phone data by researchers in Geneva. What do these tracks tell us? They show citizens moving from the suburbs into the city and back from the city to the suburbs, the main origins and destinations, that is, the starting points and end points of trips within the city. The map also reflects the city's dynamics over a day. This kind of visualization gives us a great deal of information.
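Behind trajectory maps like this there is usually an aggregation step: individual trips are collapsed into counts per origin-destination pair, and the counts drive the line weights. A minimal sketch (the zone names are invented for illustration):

```python
from collections import Counter

def od_counts(trips):
    """Aggregate individual trips into counts per (origin, destination) pair."""
    return Counter((origin, dest) for origin, dest in trips)

# Hypothetical trips between a suburb and the city center.
trips = [("suburb_A", "center"), ("suburb_A", "center"), ("center", "suburb_B")]
flows = od_counts(trips)
print(flows[("suburb_A", "center")])  # 2
```

Each aggregated pair then becomes one line on the map, with its count controlling thickness or color.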
These next ones are by some young scholars from Fudan University. Using Shanghai mobile phone data, they made a visualization of Shanghai commuting. This picture is very informative; you can find a lot in it and say a lot about Shanghai: the connections between the periphery and the downtown area, the links between the main outlying points and the city center, the links between those points and the new towns, and some special links between particular points. So this picture is itself a very good piece of big data research. It lets many people think further and do new research on the basis of this graph. In big data urban spatial analysis, this kind of work is still a very important first step.

There is other visualization work as well. In the big data era we find data we could never have imagined: data you could not even think about in the past can now be used. This diagram presents the results of a text analysis. People in New York leave messages on social media, and some of those messages are about the city
and about what is not good in it: small complaints. Scholars analyzed these complaint messages semantically. The semantic analysis found three main types of complaints. One is about the noise of the city. Complaints about noise are mainly concentrated in the south, so we know that in this part of the city the public feels it is too noisy. In the past we might have had data like noise monitoring; now we also have data on what people feel. What does that mean? It means that in the age of big data, people themselves have become sensors. We see with our eyes, hear with our ears, and smell with our noses, and these perceptions, through the photos and messages we leave on the Internet, come to reflect every aspect of the city, space by space. So this is where there is a lot of noise; around it there are complaints about graffiti; and the map also shows where young people's nightlife is concentrated
especially in these places, and that causes some dissatisfaction. Then there is litter in the suburbs: very dirty, and that too caused dissatisfaction. So this picture tells us not only what people in this city perceive, that this place is bad and that place is not good; more importantly, it foreshadows that in the age of big data a person is a sensor, a moving sensor, able to send us all kinds of information as a record.
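The kind of complaint classification described above can be sketched with a toy keyword matcher (the categories come from the lecture, but the keywords and the method are assumptions for illustration, not the scholars' actual semantic analysis):

```python
# Toy classifier for the three complaint types mentioned: noise, graffiti, litter.
KEYWORDS = {
    "noise": {"noisy", "loud", "noise"},
    "graffiti": {"graffiti", "tagging"},
    "litter": {"litter", "trash", "dirty"},
}

def classify_complaint(message: str) -> str:
    """Return the first complaint category whose keywords appear in the message."""
    words = set(message.lower().split())
    for category, keys in KEYWORDS.items():
        if words & keys:
            return category
    return "other"

print(classify_complaint("This street is way too noisy at night"))  # noise
```

Mapping each classified message to the location it was posted from is what turns a pile of social media text into the perception map the lecture describes.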