geoinfo2223:groupb:start

Differences

This shows you the differences between two versions of the page.

geoinfo2223:groupb:start [2023/03/31 23:02] sindhya-babu.rajendra-babugeoinfo2223:groupb:start [2023/03/31 23:39] (current) – [Webscraping of Water gauge Stations from Emscher Genossenschaft Lippe Verband website] sahil001
Line 1: Line 1:
-======= Geo Informatics Final Project : Group B ======= +//Contributors:// Sindhya Babu, Kiara Meço, and Sahil Chande
-==== M-IE_2.02 Geoinformatics, WS2022/23 ==== +
-** Under supervision of Prof. Rolf Becker ** +
-===== Participants:=====+
  
-** 1- Sindhya Babu - 29928 **+======= Webscraping of Water gauge Stations from Emscher Genossenschaft Lippe Verband website =======
  
-** 2- Kiara Meço   - 32358 ** +{{ :geoinfo2223:groupb:cover.png?700 |}}
- +
-** 3- Sahil Chande.- 29927 **+
  
 ===== Introduction ===== ===== Introduction =====
Line 62: Line 57:
 ** Figure 3: Python code showing extracting text of station name and values for KA Hamm ** ** Figure 3: Python code showing extracting text of station name and values for KA Hamm **
  
-After looping over, we found that several PIDVal contained no data. We drop these rows with no data and now store the new data frame with non-null values. The new data frame contains 131 Stations and only 103 stations had geo-coordinates data available as shown below +After looping over the stations, we found that several PIDVal entries contained no data. We drop these rows and store the remaining non-null rows in a new data frame. The new data frame contains 131 stations, of which only 103 have geo-coordinate data available, as shown in Figure 4.
  
 {{:geoinfo2223:groupb:fig4_1.png?400|}} {{:geoinfo2223:groupb:fig4_1.png?400|}}
Line 68: Line 63:
 {{:geoinfo2223:groupb:fig4_2.png?400|}} {{:geoinfo2223:groupb:fig4_2.png?400|}}
  
-** Figure 4: Data frame showing the data types and number of non-null column values. **+** Figure 4: Data frame showing the data types and number of non-null column values. **
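The clean-up step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name drop_empty_stations, the columns station and value, and the sample rows (including the placeholder coordinates) are hypothetical. Only the two coordinate column names are taken from the text.

```python
import pandas as pd

# Coordinate column names as given in the text.
EASTING = "Rechtswert_(Gauss-Krüger)"
NORTHING = "Hochwert_(Gauss-Krüger)"


def drop_empty_stations(df: pd.DataFrame, value_col: str = "value") -> pd.DataFrame:
    """Return only the rows that actually carry a scraped value."""
    return df.dropna(subset=[value_col]).reset_index(drop=True)


# Hypothetical sample rows standing in for the scraped data;
# the coordinates are placeholders, not real station positions.
raw = pd.DataFrame(
    {
        "station": ["KA Hamm", "Station B", "Station C"],
        "value": [1.23, None, 0.87],
        EASTING: [2620000.0, None, None],
        NORTHING: [5725000.0, None, None],
    }
)

stations = drop_empty_stations(raw)

# Rows for which both coordinates are available.
with_coords = stations.dropna(subset=[EASTING, NORTHING])

print(len(stations), "stations with data,", len(with_coords), "with coordinates")
stations.info()  # data types and non-null counts, comparable to Figure 4
```

Calling info() on the cleaned data frame gives the same kind of overview as Figure 4: one line per column with its non-null count and data type.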
  
-The geo-coordinates values of Rechtswert and Hochwert stred in the above data frame is still of data type float63. Since we want our co-ordinates to be recognized as geographic location data, we use    geoPandas package in python, to convert pandas data frame to a geo data frame or gdf. Since a geo data frame requires a shapely object, we pass the columns containing Easting and Northing values i.e Rechtswert_(Gauss-Krüger), Hochwert_(Gauss-Krüger) respectively are into the function points_from_xy to transform it to shapely point. +The geo-coordinate values Rechtswert and Hochwert stored in the above data frame are still of data type float64. Since we want our coordinates to be recognized as geographic location data, we use the GeoPandas package in Python to convert the pandas data frame to a geo data frame (gdf). Since a geo data frame requires shapely geometry objects, we pass the columns containing the Easting and Northing values, i.e. Rechtswert_(Gauss-Krüger) and Hochwert_(Gauss-Krüger) respectively, into the function points_from_xy to transform them into shapely points.
  
-The below figure shows an example of how geo data frame, gdf look like. +The figure below shows an example of what the geo data frame (gdf) looks like.
  
 {{:geoinfo2223:groupb:fig5.png?400|}} {{:geoinfo2223:groupb:fig5.png?400|}}
Line 78: Line 73:
 ** Figure 5: Geo data frame containing geometry column as shapely points ** ** Figure 5: Geo data frame containing geometry column as shapely points **
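A conversion along these lines could look like the sketch below. It assumes the coordinate column names quoted in the text and the CRS EPSG:31466. The helper name to_geodataframe and the sample coordinates are hypothetical placeholders, and the project's own code may differ.

```python
import geopandas as gpd
import pandas as pd

# Coordinate column names as given in the text.
EASTING = "Rechtswert_(Gauss-Krüger)"
NORTHING = "Hochwert_(Gauss-Krüger)"


def to_geodataframe(df: pd.DataFrame, crs: str = "EPSG:31466") -> gpd.GeoDataFrame:
    """Build a GeoDataFrame whose geometry column holds shapely points."""
    # points_from_xy expects x (Easting) first, then y (Northing).
    geometry = gpd.points_from_xy(df[EASTING], df[NORTHING])
    return gpd.GeoDataFrame(df, geometry=geometry, crs=crs)


# Hypothetical sample rows; the coordinates are placeholders.
df = pd.DataFrame(
    {
        "station": ["KA Hamm", "Station C"],
        EASTING: [2620000.0, 2580000.0],
        NORTHING: [5725000.0, 5710000.0],
    }
)

gdf = to_geodataframe(df)
print(gdf[["station", "geometry"]])
print(gdf.crs)
```

Setting the CRS when the geo data frame is created keeps the Gauss-Krüger coordinates unchanged and only labels them, so the layer lines up later with a map in the same CRS.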
  
-=====  Storing the water stations master data in PostgreSQL database =====+=====  Storing the master data of Water Stations in PostgreSQL database =====
  
-We create a data base env_db and a new schema named ‘eglv’ is created under the data base using super user env_master. Under this schema we create a table ‘eglv_stations’ and upload the geo data frame to the table ‘eglv_stations’. The connection to the PostGIS database from python is enabled by creating a connection engine using sqlalchemy package and we pass this connection engine to_postgis. With chucksize=100, 100 rows will be written at a time to the data base. +We create a database //env_db//, and a new schema named //‘eglv’// is created under it using the superuser //env_master//. Under this schema, we create a table //‘eglv_stations’// and upload the geo data frame to it. The connection to the PostGIS database from Python is enabled by creating a connection engine with the sqlalchemy package, which we pass to to_postgis. With chunksize=100, 100 rows are written to the database at a time. This is shown in Figures 6 and 7. Since the data frame contains only 131 rows, chunksize does not play a significant role here, unlike with much larger tables.
  
 {{:geoinfo2223:groupb:screenshot_2023-03-31_at_11.37.34_am.png?400|}} {{:geoinfo2223:groupb:screenshot_2023-03-31_at_11.37.34_am.png?400|}}
Line 91: Line 86:
  
 ** Figure 7: ‘eglv_stations’ table created under schema eglv shown in PgAdmin 4 ** ** Figure 7: ‘eglv_stations’ table created under schema eglv shown in PgAdmin 4 **
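The upload step can be sketched as follows. The database, schema, table and user names come from the text. The host, port, password, the helper names and the choice of if_exists="replace" are assumptions made for illustration only. to_postgis also needs the GeoAlchemy2 package and a PostgreSQL driver such as psycopg2 to be installed.

```python
from urllib.parse import quote_plus


def build_postgres_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy connection URL, escaping special characters."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def upload_stations(gdf, url: str, schema: str = "eglv",
                    table: str = "eglv_stations", chunksize: int = 100) -> None:
    """Write a GeoDataFrame to a PostGIS table, chunksize rows at a time."""
    # Imported here so the URL helper works without the database stack installed.
    from sqlalchemy import create_engine

    engine = create_engine(url)
    gdf.to_postgis(
        table,
        engine,
        schema=schema,
        if_exists="replace",  # assumption: overwrite the table on re-upload
        index=False,
        chunksize=chunksize,
    )


# Hypothetical credentials and host; replace with the real connection details.
url = build_postgres_url("env_master", "secret", "localhost", 5432, "env_db")
print(url)
# upload_stations(gdf, url)  # run once a GeoDataFrame and a database are available
```

With only 131 rows, a single chunk of 100 and a second of 31 are written, which is why the chunk size matters little here.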
 +
 +
 +Next, we run a select query on the table ‘eglv_stations’ to retrieve all rows and check that the data has been uploaded correctly.
 +
  
 {{:geoinfo2223:groupb:fig7.png?400|}} {{:geoinfo2223:groupb:fig7.png?400|}}
Line 98: Line 97:
 ===== Plotting the co-ordinates in Qgis ===== ===== Plotting the co-ordinates in Qgis =====
  
-In QGIS we select the EPSG: 31466 as Projected Coordinate Reference System (CRS) which is the DHDN / 3-degree Gauss-Kruger zone 2 corresponding to the co-ordinate system used by the Emscher Genossenschaft Lippe Verband. We first add PostGIS layer and connect to our data base. After successfully connecting to the data base by entering the super user credentials , we can see that the eglv schema and eglv_station is available, shown as in the below figure. +In QGIS we select //EPSG: 31466// as the projected Coordinate Reference System (CRS), i.e. //DHDN / 3-degree Gauss-Kruger zone 2//, which corresponds to the coordinate system used by the Emscher Genossenschaft Lippe Verband. We first add a PostGIS layer and connect it to our database. After successfully connecting to the database by entering the superuser credentials, we can see that the eglv schema and the eglv_stations table are available, as shown in the figure below.
  
 {{:geoinfo2223:groupb:ps8.png?400|}} {{:geoinfo2223:groupb:ps8.png?400|}}
  
-After successful connection to Postgis.+After successful connection to Postgis.
  
-As a base layer, Topographische NRW DTK100 Farbe Map is added , also projected as EPSG: 31466 co-ordinate system as shown in the below figure. The inverted triangles indicate the location of the stations. +As a base layer, we add the WMS layer //NW Digitale Topographische Karten DTK100 Farbe//, also projected in the EPSG: 31466 coordinate system, as shown in the figure below. The inverted triangles indicate the locations of the stations.
  
 {{:geoinfo2223:groupb:fig9.png?400|}} {{:geoinfo2223:groupb:fig9.png?400|}}
  
-Here in the below figure we can see the zoomed out map with all stations with dark red dot with same map Topographische NRW DTK100 Farbe and also projected in EPSG: 31466 co-ordinate system. +In the figure below, we can see the zoomed-out map with all stations marked as dark red dots, on the same Topographische NRW DTK100 Farbe map and also projected in the EPSG: 31466 coordinate system.
  
 {{:geoinfo2223:groupb:fig9_1.png?400|}} {{:geoinfo2223:groupb:fig9_1.png?400|}}
Line 114: Line 113:
 ** Figure 9: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS ** ** Figure 9: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS **
  
-Figure 10 shows the snippet of the location of few of the stations with the scale of 1 to 1 milliondark red dots are used to mark the station on NRW Topographische Karte Map. +Figure 10 shows a snippet of the locations of a few of the stations at a scale of 1:1000000. Dark red dots are used to mark the stations on the WMS layer.
  
 {{:geoinfo2223:groupb:fig10.png?400|}} {{:geoinfo2223:groupb:fig10.png?400|}}
Line 120: Line 119:
 ** Figure 10: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS on scale 1:1000000 ** ** Figure 10: The station locations plotted on NRW Topographische Karte Map in EPSG: 31466 CRS on scale 1:1000000 **
  
-while plotting exact points on map it is also important to take the background map similar to one which we have for the refrencing. Here in the figure 11 below you can see first image as the selected QGIS map for plotting stations and the second image show the map which they have on the website. both the maps shows the location of station in KA Hamm. +When plotting exact points on the map, it is also important to choose a background map similar to the one used for reference. In Figure 11 below, the first image is the QGIS map selected for plotting the stations and the second image shows the map on the website. Both maps show the location of the station in KA Hamm.
  
 {{:geoinfo2223:groupb:fig11.png?400|}} {{:geoinfo2223:groupb:fig11.png?400|}}
Line 128: Line 127:
 ** Figure 11: Comparison between KA Hamm Station in QGIS Vs KA Hamm Station in Emscher Genossenschaft Lippe Verband web page. ** ** Figure 11: Comparison between KA Hamm Station in QGIS Vs KA Hamm Station in Emscher Genossenschaft Lippe Verband web page. **
  
-In figure 12 we can see that all the stations which are listed on Emscher Genossenschaft Lippe Verband web page with coordinates data are shown below with custom made location marker in dark blue colour. +In Figure 12 we can see all the stations that are listed on the Emscher Genossenschaft Lippe Verband web page with coordinate data, shown with custom-made location markers in dark blue.
  
 {{:geoinfo2223:groupb:fig12.png?400|}} {{:geoinfo2223:groupb:fig12.png?400|}}
  
-** Figure 12: All stations which are listed on Emscher Genossenschaft Lippe Verband web page marked with custom symbol. ** +** Figure 12: All stations listed on the Emscher Genossenschaft Lippe Verband web page, marked with a custom symbol. **
  
 ====== Periodic Web Scraping of 'Aktuelle Pegelstände für Emscher und Lippe' ====== ====== Periodic Web Scraping of 'Aktuelle Pegelstände für Emscher und Lippe' ======
geoinfo2223/groupb/start.1680296547.txt.gz · Last modified: 2023/03/31 23:02 by sindhya-babu.rajendra-babu