geoinfo2223:groupb:start
Differences
This shows you the differences between two versions of the page.
geoinfo2223:groupb:start [2023/03/31 23:00] – sindhya-babu.rajendra-babu | geoinfo2223:groupb:start [2023/03/31 23:39] (current) – [Webscraping of Water gauge Stations from Emscher Genossenschaft Lippe Verband website] sahil001 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ======= Geo Informatics Final Project : Group B ======= | + | // |
- | ==== M-IE_2.02 Geoinformatics, | + | |
- | ** Under supervision of : Prof. Rolf Becker ** | + | |
- | ===== Participants: | + | |
- | ** 1- Sindhya Babu - 29928 ** | + | ======= Webscraping of Water gauge Stations from Emscher Genossenschaft Lippe Verband website ======= |
- | ** 2- Kiara Meço - 32358 ** | + | {{ : |
- | + | ||
- | ** 3- Sahil Chande.- 29927 ** | + | |
===== Introduction ===== | ===== Introduction ===== | ||
Line 56: | Line 51: | ||
- | The data extracted for one station is shown below. The data frame contains two values ‘Station’ and ‘Station Values’. The Station Values column is then split to several columns and renamed and stored as a data frame. | + | The data extracted for one station is shown below. The data frame contains two columns, ‘Station’ and ‘Station Values’. The ‘Station Values’ column is then split into several columns, which are renamed, and the result is stored as a new data frame. |
{{: | {{: | ||
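A minimal sketch of the split step, assuming pandas. The source does not show the text format of ‘Station Values’, so the `;` separator, the numbers, and the column name `Value` are assumptions made for illustration only; `Rechtswert` and `Hochwert` are the coordinate names used later on this page.

```python
import pandas as pd

# Hypothetical raw frame: the real 'Station Values' text format is not shown
# in the source, so a ';'-separated string with made-up numbers is assumed.
raw = pd.DataFrame({
    "Station": ["KA Hamm"],
    "Station Values": ["2625000;5730000;1.23"],
})

# Split the combined column into one column per value.
parts = raw["Station Values"].str.split(";", expand=True)

# Rename the split columns ('Value' is a hypothetical name).
parts.columns = ["Rechtswert", "Hochwert", "Value"]

# Store the result as a new data frame with numeric value columns.
stations = pd.concat([raw[["Station"]], parts.astype(float)], axis=1)
print(stations)
```

With `expand=True`, `str.split` returns a data frame instead of a column of lists, which is what makes the renaming step a simple assignment to `columns`.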
Line 62: | Line 57: | ||
** Figure 3: Python code showing extracting text of station name and values for KA Hamm ** | ** Figure 3: Python code showing extracting text of station name and values for KA Hamm ** | ||
- | After looping over, we found that several PIDVal contained no data. We drop these rows and now store the new data frame with non-null | + | After looping over, we found that several PIDVal contained no data. We drop these rows with no data and now store the new data frame with non-null |
{{: | {{: | ||
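Dropping the empty rows could look roughly like the following sketch, assuming pandas. The frame contents are invented; only the names `PIDVal`, ‘Station’, ‘Station Values’ and "KA Hamm" come from the text above.

```python
import pandas as pd

# Hypothetical loop result: one row per PIDVal, with None where the
# page returned no data for that id.
scraped = pd.DataFrame({
    "PIDVal": [1, 2, 3, 4],
    "Station": ["KA Hamm", None, "Station X", None],
    "Station Values": ["a;b", None, "c;d", None],
})

# Drop rows without station data and renumber the remaining rows.
cleaned = scraped.dropna(subset=["Station", "Station Values"]).reset_index(drop=True)

# Print the data types and the non-null count of every column.
cleaned.info()
```

`info()` is one way to confirm that no null values remain, since it lists the non-null count per column next to the data type.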
Line 68: | Line 63: | ||
{{: | {{: | ||
- | ** Figure 4: Data frame showing the data types and number of non-null column values. ** | + | ** Figure 4: Data frame showing the data types and the number of non-null column values. ** |
- | The geo-coordinates values of Rechtswert and Hochwert | + | The geo-coordinates values of Rechtswert and Hochwert |
- | The below figure shows an example of how geo data frame, gdf look like. | + | The figure below shows an example of what the geo data frame, gdf, looks like. |
{{: | {{: | ||
Line 78: | Line 73: | ||
** Figure 5: Geo data frame containing geometry column as shapely points ** | ** Figure 5: Geo data frame containing geometry column as shapely points ** | ||
- | ===== Storing the water stations | + | ===== Storing the master data of Water Stations |
- | We create a data base env_db and a new schema named ‘eglv’ is created under the data base using super user env_master. Under this schema we create a table ‘eglv_stations’ and upload the geo data frame to the table ‘eglv_stations’. The connection to the PostGIS database from python is enabled by creating a connection engine using sqlalchemy package and we pass this connection engine to_postgis. With chucksize=100, | + | We create a database //env_db// and a new schema named //‘eglv’// is created under the database |
{{: | {{: | ||
Line 91: | Line 86: | ||
** Figure 7: ‘eglv_stations’ table created under schema eglv shown in PgAdmin 4 ** | ** Figure 7: ‘eglv_stations’ table created under schema eglv shown in PgAdmin 4 ** | ||
+ | |||
+ | |||
+ | Next, we run a SELECT query on the table ‘eglv_stations’ to fetch all the rows and check that all the data has been uploaded correctly. | ||
+ | |||
{{: | {{: | ||
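The idea behind this check can be sketched as follows. So that the example runs without a database server, it uses Python's built-in `sqlite3` module as a stand-in for PostGIS; this is not the setup used in the project. The in-memory database is attached under the name `eglv` only so that the qualified table name `eglv.eglv_stations` matches the text, and the column names and rows are hypothetical.

```python
import sqlite3

# Stand-in for the PostGIS database: an in-memory SQLite database attached
# as 'eglv' so the qualified table name matches the schema in the text.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS eglv")
conn.execute(
    "CREATE TABLE eglv.eglv_stations (station TEXT, rechtswert REAL, hochwert REAL)"
)

# Hypothetical rows standing in for the uploaded geo data frame.
uploaded = [
    ("KA Hamm", 2625000.0, 5730000.0),
    ("Station X", 2600000.0, 5720000.0),
]
conn.executemany("INSERT INTO eglv.eglv_stations VALUES (?, ?, ?)", uploaded)

# The check itself: fetch every row and compare the count with the local data.
rows = conn.execute("SELECT * FROM eglv.eglv_stations").fetchall()
all_uploaded = len(rows) == len(uploaded)
print(all_uploaded)
conn.close()
```

Against the real database, the same `SELECT * FROM eglv.eglv_stations` statement would be sent through the sqlalchemy connection engine described above, and the row count compared with the length of the geo data frame.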
Line 98: | Line 97: | ||
===== Plotting the co-ordinates in QGIS ===== | ===== Plotting the co-ordinates in QGIS ===== | ||
- | In QGIS we select the EPSG: 31466 as Projected Coordinate Reference System (CRS) which is the DHDN / 3-degree Gauss-Kruger zone 2 corresponding to the co-ordinate system used by the Emscher Genossenschaft Lippe Verband. We first add PostGIS layer and connect to our data base. After successfully connecting to the data base by entering the super user credentials , we can see that the eglv schema and eglv_station | + | In QGIS we select the //EPSG: 31466// as the Projected Coordinate Reference System (CRS) which is the// DHDN / 3-degree Gauss-Kruger zone 2// corresponding to the co-ordinate system used by the Emscher Genossenschaft Lippe Verband. We first add the PostGIS layer and connect |
{{: | {{: | ||
- | After successful connection to Postgis. | + | After a successful connection to Postgis. |
- | As a base layer, Topographische | + | As a base layer, |
{{: | {{: | ||
- | Here in the below figure we can see the zoomed out map with all stations with dark red dot with same map Topographische NRW DTK100 Farbe and also projected in EPSG: 31466 co-ordinate | + | Here in the below figure, we can see the zoomed-out map with all stations with dark red dots with the same map Topographische NRW DTK100 Farbe and also projected in EPSG: 31466 coordinate |
{{: | {{: | ||
Line 114: | Line 113: | ||
** Figure 9: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS ** | ** Figure 9: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS ** | ||
- | Figure 10 shows the snippet of the location of few of the stations with the scale of 1 to 1 million. dark red dots are used to mark the station on NRW Topographische Karte Map. | + | Figure 10 shows a snippet with the locations of a few of the stations at a scale of 1:1000000. Dark red dots mark the stations on the WMS layer. |
{{: | {{: | ||
Line 120: | Line 119: | ||
** Figure 10: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS on scale 1:1000000 ** | ** Figure 10: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS on scale 1:1000000 ** | ||
- | while plotting exact points on map it is also important to take the background map similar to one which we have for the refrencing. Here in the figure 11 below you can see first image as the selected QGIS map for plotting stations and the second image show the map which they have on the website. | + | When plotting exact points on the map, it is also important to use a background map similar to the one used for reference. In figure 11 below, the first image is the QGIS map selected for plotting the stations and the second image shows the map used on the website. |
{{: | {{: | ||
Line 128: | Line 127: | ||
** Figure 11: Comparison between KA Hamm Station in QGIS Vs KA Hamm Station in Emscher Genossenschaft Lippe Verband web page. ** | ** Figure 11: Comparison between KA Hamm Station in QGIS Vs KA Hamm Station in Emscher Genossenschaft Lippe Verband web page. ** | ||
- | In figure 12 we can see that all the stations | + | In figure 12 we can see that all the stations are listed on the Emscher Genossenschaft Lippe Verband web page with coordinates data shown below with custom-made location |
{{: | {{: | ||
- | ** Figure 12: All stations which are listed on Emscher Genossenschaft Lippe Verband web page marked with custom symbol. ** | + | ** Figure 12: All stations which are listed on the Emscher Genossenschaft Lippe Verband web page marked with a custom symbol. ** |
====== Periodic Web Scraping of ' | ====== Periodic Web Scraping of ' |
geoinfo2223/groupb/start.1680296421.txt.gz · Last modified: 2023/03/31 23:00 by sindhya-babu.rajendra-babu