
Introduction to swissexchangedata’s newsboard_data() function

This vignette offers a glimpse of the inner workings of the swissexchangedata package, which was built for educational purposes: to learn and sharpen skills in web scraping of data from online sources, data wrangling and package development.

To extract Swiss Stock Exchange data by hand from the SIX Group newsboard website, you must:

  • Define the date range, message type, market and products.
  • Open the results in a separate view to print them or save them as a PDF file.

The swissexchangedata package makes it easier to define the desired parameters and to load the data directly into your workspace, using its currently available function, newsboard_data().

More functions will be added to access data from the other tabs provided on the company website.

The date range on the website covers only 2 years of data, so the stored data needs to be updated over time to retain as much of the history as possible. The aim is to publish successive versions of the data as it is refreshed.
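
One way to keep a stored snapshot current is to append each fresh pull and drop repeated messages. The following is a minimal sketch on toy data, not the package's own update routine; it assumes that messageNo uniquely identifies a message, and the data frames `stored` and `fresh` are hypothetical stand-ins.

```r
# Toy stand-ins for a stored snapshot and a fresh pull (hypothetical values).
stored <- data.frame(messageNo = c(101, 102, 103), title = c("a", "b", "c"))
fresh  <- data.frame(messageNo = c(103, 104), title = c("c", "d"))

# Stack both pulls, then keep the first occurrence of each messageNo
# (assumption: messageNo is a unique message identifier).
combined <- rbind(stored, fresh)
combined <- combined[!duplicated(combined$messageNo), ]
rownames(combined) <- NULL
combined
```

Keeping the first occurrence means the stored copy of a message wins over a later pull; reverse the order in rbind() if the newer copy should win.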

The scraped data is cleaned further to add nested data.

Data Access

Get the data from 1 October 2023 up to the current date:

all_items_data_sep_2024 <- newsboard_data(firstDate = '2023-10-01', lastDate = Sys.Date())
#> [1] "Page 0"
#> [1] "Page 0 has 10 observations."
#> [1] "Page 1"
#> [1] "Page 1 has 10 observations."
#> [1] "Page 2"
#> [1] "Page 2 has 10 observations."
#> [1] "Page 3"
#> [1] "Page 3 has 10 observations."
#> [1] "Page 4"
#> [1] "Page 4 has 10 observations."
#> [1] "Page 5"
#> [1] "Page 5 has 10 observations."
#> [1] "Page 6"
#> [1] "Page 6 has 10 observations."
#> [1] "Page 7"
#> [1] "Page 7 has 10 observations."
#> [1] "Page 8"
#> [1] "Page 8 has 10 observations."
#> [1] "Page 9"
#> [1] "Page 9 has 10 observations."
#> [1] "Page 10"
#> [1] "Page 10 has 10 observations."
#> [1] "Page 11"
#> [1] "Page 11 has 10 observations."
#> [1] "Page 12"
#> [1] "Page 12 has 10 observations."
#> [1] "Page 13"
#> [1] "Page 13 has 10 observations."
#> [1] "Page 14"
#> [1] "Page 14 has 10 observations."
#> [1] "Page 15"
#> [1] "Page 15 has 10 observations."
#> [1] "Page 16"
#> [1] "Page 16 has 10 observations."
#> [1] "Page 17"
#> [1] "Page 17 has 10 observations."
#> [1] "Page 18"
#> [1] "Page 18 has 8 observations."
#> [1] "Page 19"
#> [1] "Page 19 has  observations."
all_items_data_sep_2024
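
The log above suggests that results arrive in pages of up to 10 observations, numbered from 0, and that collection stops at the first empty page. The loop below is a rough sketch of that pattern only, not the actual code of newsboard_data(); `fetch_page()` and `collect_pages()` are hypothetical helpers, and `fetch_page()` fakes the web request with generated data.

```r
# Hypothetical page fetcher standing in for a web request: returns up to
# `page_size` rows per page and zero rows once the data is exhausted.
fetch_page <- function(page, total = 25, page_size = 10) {
  start <- page * page_size + 1
  if (start > total) return(data.frame(id = integer(0)))
  data.frame(id = seq(start, min(start + page_size - 1, total)))
}

# Request pages 0, 1, 2, ... until an empty page comes back.
collect_pages <- function(fetch, max_pages = 1000) {
  pages <- list()
  for (page in seq_len(max_pages) - 1) {
    chunk <- fetch(page)
    message("Page ", page, " has ", nrow(chunk), " observations.")
    if (nrow(chunk) == 0) break          # an empty page ends the loop
    pages[[length(pages) + 1]] <- chunk
  }
  do.call(rbind, pages)                  # stack all pages into one data frame
}

result <- collect_pages(fetch_page)
```

The `max_pages` cap is a safety net so that a source which never returns an empty page cannot keep the loop running indefinitely.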

Data Cleaning

The newsText column holds HTML that embeds further tables on the trades, so it needs additional cleaning before those tables can be used. The same cleaning code is used to prepare the package data swissexchangedata::newsboardmarketdata and is replicated here.

# rvest provides read_html(), html_text() and html_table()
library(rvest)

# obtain the plain text of an HTML fragment
itemlist_newsText_clean <- function(vec) {
  read_html(vec) |> html_text()
}

# obtain the data tables of an HTML fragment
itemlist_newsText_table_clean <- function(vec) {
  # parse every HTML table found in the fragment
  temp_obj <- read_html(vec) |> html_table(fill = TRUE)

  # if no table was found, return NA; otherwise convert to a data frame
  if (length(temp_obj) > 0) {
    temp_obj <- temp_obj |>
      as.matrix() |>
      t() |>
      as.data.frame()
  } else {
    temp_obj <- NA
  }
  temp_obj
}
# keep a copy of the original data
all_items_data_sep_2024_original <- all_items_data_sep_2024
# locate the newsText column
newsText_column_index <- grep("newsText", colnames(all_items_data_sep_2024))
# extract the plain text of each newsText entry (simplified to a vector)
all_items_data_sep_2024$data <- apply(
  X = all_items_data_sep_2024[, newsText_column_index, drop = FALSE],
  MARGIN = 1, simplify = TRUE,
  FUN = itemlist_newsText_clean)
# extract the tables of each newsText entry (kept as a list)
all_items_data_sep_2024$table_data <- apply(
  X = all_items_data_sep_2024[, newsText_column_index, drop = FALSE],
  MARGIN = 1, simplify = FALSE,
  FUN = itemlist_newsText_table_clean)
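
Because html_text() returns the text of every node, the cell values of any embedded table end up run together inside the plain-text column. If only the prose is wanted, one option is to drop the table nodes before extracting the text. This is a sketch of an alternative, not the cleaning applied to the package data; it assumes the xml2 package is installed alongside rvest, and the HTML fragment is made up for illustration.

```r
library(rvest)
library(xml2)

# Made-up fragment with the same shape as a newsText entry.
html <- paste0(
  "<div><p>The following trade has been cancelled: </p>",
  "<table><tr><th>Size</th></tr><tr><td>17</td></tr></table>",
  "<p>Regards</p></div>"
)

doc <- read_html(html)

# Remove every <table> node from the parsed document (modifies doc in place).
xml_remove(xml_find_all(doc, ".//table"))

# Extract the remaining text; html_text2() also normalises whitespace.
text_only <- html_text2(doc)
text_only
```

The tables themselves are still available from the untouched source string, so this step can run side by side with the table extraction.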

Output

all_items_data_sep_2024[1:2,1:15]
#>   messageNo         isin valorSymbol                                      title
#> 1    209039 CH1139756881      ZSMIAZ Mistrade Decision in ZSMIAZ / CH1139756881
#> 2    209038 LU0950674761      EMUSRI Mistrade Decision in EMUSRI / LU0950674761
#>   messageType broadcastDateTime                    security      tradingSegment
#> 1    Mistrade       2.02402e+13                ZSMIAZ ZKB C Structured Products
#> 2    Mistrade       2.02402e+13 UBSETF MSCI EMU SRI EUR ACC                 ETF
#>   priority markets products currency
#> 1   Normal    XQMH       DE      CHF
#> 2   Normal    XSWX       FU      EUR
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         newsText
#> 1                <div><p>In accordance with the rules of SIX Swiss Exchange, the following trade in <strong>'ZSMIAZ ZKB C'</strong> has been declared a mistrade and has therefore been cancelled: </p><table><thead><tr><th align="center" style="width: 75px">Trade Date</th><th align="center" style="width: 75px">Time</th><th align="center" style="width: 45px">Cur</th><th align="left" style="width: 75px">Size</th><th align="left" style="width: 75px">Price</th><th align="left" style="width: 75px">Trade Type</th><th align="left" style="width: 75px">Book Type</th><th align="left" style="width: 75px">Ref Exch</th></tr></thead><tbody><tr><td align="center">02.02.2024</td><td align="center">09:15:58</td><td align="center">CHF</td><td align="left">17</td><td align="left">875.0000</td><td align="left">OnExchange</td><td align="left">QuoteBook</td><td align="left"/></tr></tbody></table><p>Please find further information concerning mistrades in Directive 4: Market Control on our website.</p><p>Regards,<br/>Exchange Operations, SIX Swiss Exchange</p></div>
#> 2 <div><p>In accordance with the rules of SIX Swiss Exchange, the following trade in <strong>'UBSETF MSCI EMU SRI EUR ACC'</strong> has been declared a mistrade and has therefore been cancelled: </p><table><thead><tr><th align="center" style="width: 75px">Trade Date</th><th align="center" style="width: 75px">Time</th><th align="center" style="width: 45px">Cur</th><th align="left" style="width: 75px">Size</th><th align="left" style="width: 75px">Price</th><th align="left" style="width: 75px">Trade Type</th><th align="left" style="width: 75px">Book Type</th><th align="left" style="width: 75px">Ref Exch</th></tr></thead><tbody><tr><td align="center">01.02.2024</td><td align="center">16:00:29</td><td align="center">EUR</td><td align="left">200</td><td align="left">22.6600</td><td align="left">OnExchange</td><td align="left">QuoteBook</td><td align="left"/></tr></tbody></table><p>Please find further information concerning mistrades in Directive 4: Market Control on our website.</p><p>Regards,<br/>Exchange Operations, SIX Swiss Exchange</p></div>
#>   newsTypeCode
#> 1           MI
#> 2           MI
#>                                                                                                                                                                                                                                                                                                                                                                                                                                data
#> 1                In accordance with the rules of SIX Swiss Exchange, the following trade in 'ZSMIAZ ZKB C' has been declared a mistrade and has therefore been cancelled: Trade DateTimeCurSizePriceTrade TypeBook TypeRef Exch02.02.202409:15:58CHF17875.0000OnExchangeQuoteBookPlease find further information concerning mistrades in Directive 4: Market Control on our website.Regards,Exchange Operations, SIX Swiss Exchange
#> 2 In accordance with the rules of SIX Swiss Exchange, the following trade in 'UBSETF MSCI EMU SRI EUR ACC' has been declared a mistrade and has therefore been cancelled: Trade DateTimeCurSizePriceTrade TypeBook TypeRef Exch01.02.202416:00:29EUR20022.6600OnExchangeQuoteBookPlease find further information concerning mistrades in Directive 4: Market Control on our website.Regards,Exchange Operations, SIX Swiss Exchange

The table of trades embedded in the newsText HTML has been extracted into the table_data column (column 16):

all_items_data_sep_2024[1,16][[1]][[1]]
#> [[1]]
#> # A tibble: 1 × 8
#>   `Trade Date` Time     Cur    Size Price `Trade Type` `Book Type` `Ref Exch`
#>   <chr>        <chr>    <chr> <int> <dbl> <chr>        <chr>       <lgl>     
#> 1 02.02.2024   09:15:58 CHF      17   875 OnExchange   QuoteBook   NA
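
For analysis it is often handier to have all trades in one flat table with proper date-times. The sketch below works on toy tables whose values are copied from the two messages shown above; the flat list layout, the `traded_at` column name and the Europe/Zurich time zone are assumptions made for illustration, not part of the package.

```r
# Toy per-message tables, named by messageNo (layout is an assumption).
tables <- list(
  "209039" = data.frame(`Trade Date` = "02.02.2024", Time = "09:15:58",
                        Cur = "CHF", Size = 17L, Price = 875.00,
                        check.names = FALSE),
  "209038" = data.frame(`Trade Date` = "01.02.2024", Time = "16:00:29",
                        Cur = "EUR", Size = 200L, Price = 22.66,
                        check.names = FALSE)
)

# Tag each table with its messageNo, then stack them into one data frame.
tagged <- Map(function(id, tbl) cbind(messageNo = id, tbl),
              names(tables), tables)
trades <- do.call(rbind, tagged)
rownames(trades) <- NULL

# Combine the day.month.year date and the time into a single timestamp
# (assumption: times are local to Zurich).
trades$traded_at <- as.POSIXct(
  paste(trades[["Trade Date"]], trades$Time),
  format = "%d.%m.%Y %H:%M:%S", tz = "Europe/Zurich"
)
trades
```

Carrying messageNo along keeps every trade traceable back to the message it came from.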