Google: Ingesting New Candidates

Update Outstanding Candidates

Ingest New Candidates

Export Oak Candidates

Create Text Files for Caia Pull Jobs

 


Update Outstanding Candidates

Before ingesting new candidates, update outstanding candidates:

Run reports

  1. open Google DB
  2. export update 1 – outstanding candidates as text on desktop
  3. add a “Barcode” header line
  4. in Alma, add a set: outstanding candidates
  5. Manage Sets –> Members –> then Export list to desktop
  6. Monitor Jobs –> History –> Report –> download Non-matched Records to desktop

Reject for location

  1. on desktop, open results.xlsx
  2. filter Permanent Location for locations that don’t go to google
  3. copy those barcodes and save as barcodes.txt
  4. check Caia for locations in departmental libraries–update location in candidates table and deaccession as needed
  5. import barcodes.txt in Google DB
  6. run update 2 – reject for location

Bad barcodes

  1. open a new excel spreadsheet and import NonMatchedRecords.csv
  2. copy barcodes and save as barcodes.txt
  3. close spreadsheet without saving
  4. import barcodes.txt in Google DB
  5. export update 3 – bad barcode data as excel on desktop
  6. open spreadsheet, copy Item ID column without header
  7. in analytics, open barcode & loc from items then export to desktop
  8. open spreadsheet and adjust column widths, then save
  9. import barcode & loc from items into Google DB as temp
  10. run update 4 – fix barcode and loc
  11. open spreadsheet again, filter for unusual locations, edit candidates table as needed

 

Ingest New Candidates

Google candidate lists

  1. download each file (except combined) from the most recent date. EXCLUDE us_federal_gov_docs until further notice
  2. an a piece of paper, note that date.
  3. 7-Zip -> Extract Here [twice]

Excel

  1. create a new excel spreadsheet named google ingest with the following columns (25 wide):
    • Google Candidate List (Text)
    • Google List Date (Date)
    • Barcode (Text)
  2. in a new excel spreadsheet, Get Data From Text, and select a file
    • Delimited | Next
    • Next
    • scroll to highlight the barcode field and choose Text | Finish
    • OK
    • scroll down to first empty A cell
    • repeat for all files for that Candidate List
  3. Format Cells in the barcodes column as Text
    • filter barcodes for does not begin with: 3
    • paste barcode over or delete row as appropriate
  4. copy the list of barcodes and paste them into google ingest
  5. enter the correct value for Google_Candidate_List:
    • International Documents
    • PreDate
    • Renewal
    • State Documents
    • US Documents     EXCLUDE until further notice
  6. enter the date of the candidate files (MM/DD/YYYY).
  7. copy A and B to another line, autofill down (watch for incrementing!)
  8. close and delete the candidate files.
  9. repeat 2-8 for all candidate files.
  10. deduplicate by barcode.
  11. delete all blank lines at the bottom of the sheet.
  12. note the number of unique barcodes (to the k)

Notepad

  1. Copy the barcodes into files named google ingest 01.txt, etc. Files must have less than 100k lines.

Alma

  1. Manage Sets –> Add Set as google ingest 01, etc.
  2. Wait for email confirmation
  3. from the confirmation email, note the total number of  matches in Alma (to the k)
  4. in Firefox, view Members of a set, then click the Export List icon and choose Excel (current view)
  5. once the download is complete, rename the file google ingest 01.xlsx, etc.

Excel

  1. delete OCLC is blank
  2. delete host bib records
  3. delete all lines that aren’t in the correct locations (only osx, osx-nc, or stx as of 2017)
  4. note the remaining number of  candidates (to the k)

Alma analytics

  1. create a text file of all remaining candidate barcodes
  2. use the AlmaSet2Analytics procedure for each group of <64k barcodes

sandbox database

  1. compact and repair database
  2. import google ingest and alma ingest spreadsheets into sandbox database
  3. verify that ingest table fields match dbo_NewCandidates
  4. compact and repair database
  5. run create google ingest table
  6. delete google ingest and alma ingest tables
  7. compact and repair database

Google database

Starting 2017, we are checking outstanding candidates against the new candidate list, and rejecting any candidates that had fallen off the list. (Reject Code = Z). This stage also removes items that are already in the NewCandidates table.

  1. run preingest 1 – create outstanding table
  2. run preingest 2 – delete outstanding if on ingest
  3. run preingest 3 – update from outstanding, entering 1/1/<currentyear>
  4. run preingest 4 – delete if already candidate
  5. note the number of outstanding candidates, repeat candidates, and remaining candidates (to the k)
  6. rename google ingest table as new to ingest
  7. delete outstanding table
  8. compact and repair database

Google database

  1. open new to ingest in design view and add the following fields
    • Put on Picklist: Date/Time
    • Shipped: Yes/No
    • Reject Code: Short Text
  2. compact and repair database
  3. run the ingest reject – queries
  4. compact and repair database
  5. run ingest 5 – append to candidates
    • append each category for Google_Candidate_List separately, if necessary
  6. compact and repair database
  7. run ingest 7- verify ingest count
  8. delete google ingest and new to ingest
  9. compact and repair database
  10. export Google candidates for Oak barcodes as text

 

Export Oak Candidates

Caia

  1. break Google candidates for Oak barcodes into files of no more than 100k
  2. run item status report for each file
  3. wait for email, then download results

Excel

  1. open new spreadsheet and save as item status report
  2. Get Data From Text
  3. delete:
    • all columns after Last Job
    • Item Status: all but “Item In at Rest”
    • Collection: all but osx or osx-nc
    • Container begins with: Direct, EL, EX, O, G
    • Address: 0
    • Last Job Type: De-Accession Items
  4. remove duplicates
  5. delete lines and columns at end of results
  6. save
  7. insert an empty column A
    • A1: Shelf
    • A2: =CONCATENATE(H2,I2,J2,”:”,K2)     [correct quotes, if needed]
    • fill down
    • save
  8. sort:
    • Aisle
    • Ladder
    • Step
    • Side
    • Footprint
    • Container
  9. Save and close

sandbox db

  1. run Import-item status report with shelf
  2. open design view
  3. add Blank field as short text
  4. export google lows
  5. run lowhi 2 – delete lows
  6. export google highs
  7. delete item status report with shelf

Excel

  1. open each file and auto-adjust column width, then save
  2. email files to Google team

 

Create Text Files for Caia Pull Jobs

Set up folders and files

  1. start in: G:\OakStreet\Google Project\Google Candidates Process
  2. copy the Google 20xx folder (and subfolders) and paste in:
    G:\OakStreet\Google Project
  3. rename the Google 20xx folder to the current year
  4. save the export google highs and export google lows files from email in:
    G:\OakStreet\Google Project\Google 20xx\Splitting
  5. create working files in Splitting
    • if there are fewer than 100k lines in a file, copy export google highs and paste as export google highs working
    • if there are more than 100k lines in a file, break the file into <100k files
      • keep aisles together–use a pivot table of aisle counts to determine where to break up the file
      • copy export google highs and paste with aisle numbers in the filename, e.g.: export google highs 1-7
      • filter the aisle column and unselect the aisles you want to keep
      • delete remaining lines
      • unfilter and save

Split into aisles

  1. open folders in separate windows:
    G:\OakStreet\Google Project\Google Candidates Process [contains text files of VBA scripts]
    G:\OakStreet\Google Project\Google 20xx\Splitting [contains excel files of candidates]
  2. open appropriate working file in Splitting folder
  3. Alt-F11 –> Insert –> Module –> copy/paste script 1… –> F5
  4. close both VBA windows and save the excel file
  5. Alt-F11 –> Insert –> Module –> copy/paste script 2 –> F5 –> browse to
    G:\OakStreet\Google Project\Google 20xx\Splitting\Google Highs Excel
  6. close both VBA windows, then save and close excel
  7. repeat as needed

Change codenames

  1. open a new excel file
  2. File –> Options –> Trust Center –> Trust Center Settings –> Macro Settings –>  check: Trust access to the VBA project object model
  3. Alt-F11 –> Insert –> Module –> copy/paste script 3 –> F5 –> browse to
    G:\OakStreet\Google Project\Google 20xx\Splitting\Google Highs Excel
  4. close both VBA windows and excel without saving

Chunk out job files into aisle folders

  1. open Highs Aisle 1.xlsx
  2. Alt-F11 –> Insert –> Module –> copy/paste script 4 –> F5
  3. close both VBA windows and save the excel file
  4. Alt-F11 –> Insert –> Module –> copy/paste script 5 –> F5 –> browse to
    G:\OakStreet\Google Project\Google 20xx\Text Files for Jobs\Highs\Aisle 1
  5. close both VBA windows, then save and close excel
  6. repeat as needed

Once all the highs files have been created, repeat for lows, starting with Split into aisles