Analytics mindset ETL Case 2 – Text extraction and unique identifiers – Excel In older computer…

Analytics mindset

ETL

Case 2 – Text extraction and unique identifiers – Excel

In older computer systems, multiple values were often stored in a single cell to save space. This practice is sometimes still followed today. For example, an employee identification number may tell you the employee number, plant number and business function. That is, 143-01-Acc could identify employee 143, from plant 01, who works in accounting.
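As a minimal sketch of how such a combined identifier can be taken apart, the snippet below splits the example value on its hyphens. The function name `split_legacy_id` is hypothetical, and the parts are kept as text so that a leading zero such as the one in plant 01 is not lost.

```python
def split_legacy_id(identifier):
    """Split a hyphen-delimited identifier such as '143-01-Acc' into its parts."""
    # Assumes exactly three hyphen-separated parts, as in the example above.
    employee, plant, function = identifier.split("-")
    return employee, plant, function


employee, plant, function = split_legacy_id("143-01-Acc")
print(employee, plant, function)
```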

For this case, you are provided with an Excel data file, Analytics_mindset_case_studies_Case2_Excel.xlsx, that contains 597 rows of employee data. In the tab labeled Case 2 data, you will find three columns: EmployeeCode, FirstName and LastName. The EmployeeCode is the combination of four different fields: Location, EmpID, PlantID and PayPeriod. Each of these fields is defined as follows:

  • Location: The location code shows where the employee works. The company operates in eight different countries: Argentina (ARG), Australia (AUS), Canada (CAN), England (ENG), Germany (GER), Japan (JAP), Mexico (MEX) and the United States of America (USA). The country codes are always three letters and are the first three characters in the EmployeeCode, reading from left to right.
  • EmpID: The company assigns a random employee identification number from 1,000 to 1,597. Reading the EmployeeCode from left to right, the EmpID is the first set of numbers immediately after the Location code and preceding the hyphen (-).
  • PlantID: The company has various plants throughout the different countries. Each country numbers its own plants starting at 10 and increases the number by one for each additional plant, so the same PlantID can appear in more than one country. The PlantID is contained in the EmployeeCode, reading left to right, immediately after the hyphen (-).
  • PayPeriod: Employees are paid either weekly or monthly. The system records this as W for weekly and M for monthly. The PayPeriod is the last character in the EmployeeCode, reading from left to right.
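The four field definitions above can be sketched as a single pattern match. This is an illustration only, not the case's prescribed Excel solution: the sample value `USA1234-10W` is made up to fit the descriptions, and the function name `parse_employee_code` is hypothetical. In Excel, the same positions would typically be reached with text functions such as LEFT, MID, FIND and RIGHT.

```python
import re

# Layout inferred from the field descriptions (an assumption, not taken from the data file):
# three letters (Location), digits (EmpID), a hyphen, digits (PlantID), then W or M (PayPeriod).
CODE_PATTERN = re.compile(r"^([A-Z]{3})(\d+)-(\d+)([WM])$")


def parse_employee_code(code):
    """Return the four fields packed into one EmployeeCode value."""
    match = CODE_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"Unexpected EmployeeCode format: {code!r}")
    location, emp_id, plant_id, pay_period = match.groups()
    return {
        "Location": location,
        "EmpID": int(emp_id),
        "PlantID": int(plant_id),
        "PayPeriod": pay_period,
    }


# Made-up sample value for illustration.
print(parse_employee_code("USA1234-10W"))
```

Raising an error on values that do not match makes malformed codes visible instead of letting them pass through as partial extractions.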

Your manager has asked you to extract these fields from the EmployeeCode and to create a new unique identifier that identifies each plant by location.
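Because each country numbers its plants starting at 10, a PlantID on its own does not identify a single plant; pairing it with the Location does. The sketch below builds such a combined key by character position. The `Location-PlantID` format with a hyphen separator, the function name `location_plant_key` and the sample codes are all assumptions made for illustration, since the case does not specify the identifier's format.

```python
def location_plant_key(employee_code):
    """Build a plant identifier that is unique across countries, e.g. 'USA-10'."""
    code = employee_code.strip()
    location = code[:3]              # first three characters are the Location
    hyphen = code.index("-")         # position of the hyphen separating EmpID and PlantID
    plant_id = code[hyphen + 1:-1]   # everything between the hyphen and the final PayPeriod letter
    return f"{location}-{plant_id}"


# Made-up sample values; the real data file is not reproduced here.
codes = ["USA1234-10W", "CAN1401-10M", "USA1500-11W"]
keys = sorted({location_plant_key(c) for c in codes})
print(keys)
```

Here plant 10 in the USA and plant 10 in Canada receive different keys, which is the point of the new identifier.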
