-1

I have a data set that has two columns. AccountName and AccountNumber. It has 35 rows. I want to create a new dataframe with AccountName, AccountNumber, and LocationNumber. LocationNumber is stored in another data frame with 1 column that has 350 rows.

so basically for each account name and number, foreach location number, add another row with the account name + number + location number. so if I have 35 account numbers and 350 locations, the end goal is to have 12,250 rows. i've tried using for loop to no avail.

accounts (name | number)

STR EXP-VACATION ESTIMATE-0200900   200900
STR EXP-HOLIDAY PAY-0200920 200920
STR EXP-SICK PAY-0200930    200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990   200990

locations:

Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005

end result for EACH of the account numbers

STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-002
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-003
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-004
STR EXP-VACATION ESTIMATE-0200900   200900 Lo.702-005

the PHP code that would produce the results I want:

foreach($accounts as $name => $number) {
    foreach($locations as $location) {
        echo sprintf("%s,%s,%s\n", $name, $number, $location);
    }
}
Stephen K
  • 697
  • 9
  • 26
  • 1
    Can you include the top few rows of each data.frame? `dput(head(df1))` & `dput(head(df2))`. I'm thinking something using `merge` and `expand.grid` would do the trick. – maloneypatr Jan 27 '15 at 22:00
  • i can't - it's confidential to the company. the accountname will say something like 'delivery expense' and then the account number is '200999' and then another account name could be 'fees' and account name 200994. – Stephen K Jan 27 '15 at 22:01
  • Don't think about this as a for-each loop. You should be thinking about this as a merge. Even if you can't share the real data, you should still be able to create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) using "similar" data. See the link for tips on how to do this. – MrFlick Jan 27 '15 at 22:02
  • Yeah, it's hard to see what you're going for then. Try displaying dummy data with a fake input and what you want as an output. – maloneypatr Jan 27 '15 at 22:02
  • ok, i added data to my original post – Stephen K Jan 27 '15 at 22:09
  • Gotcha, try `expand.grid(df1$name, df2)`. This only works on vectors and I just saw that the first data set contains two columns. – maloneypatr Jan 27 '15 at 22:23

2 Answers2

0

My solution:

acc.run <- function() {
  locFileName <- 'location-list.csv'
  accFileName <- 'account-list.csv'

  locations <- read.csv(locFileName, sep=',', quote='\"', header=T)

  accounts <- read.csv(accFileName, sep=',', quote='\"', header=T)

  #Add row numbers
  accounts$rowNum <- 1:nrow(accounts)

  merged <- merge(accounts, locations)

  sorted <- merged[order(merged$rowNum), ]

  final <- sorted[, !(names(sorted) %in% c('rowNum'))]

  # Random file extension to prevent duplicate/overwriting
  rExt <- paste(round(runif(6,10,100)), sep='', collapse='')

  write.csv(final, paste('accounts-concat', rExt, '.csv', sep='', collapse=''), row.names=F)
}

let me know how I can improve this?

Stephen K
  • 697
  • 9
  • 26
0

Here's an edited version of my original answer, modified to incorporate your test information. Does this meet your needs?

# Generate some usable test data
accounts <- read.csv(text = "
AccountName|AccountNumber
STR EXP-VACATION ESTIMATE-0200900|200900
STR EXP-HOLIDAY PAY-0200920|200920
STR EXP-SICK PAY-0200930|200930
STR EXP-MISC TIME PAID,NOT WORKED-0200990|200990
", sep = "|")

locations <- read.table(header = TRUE, text = "
Location
Lo.702-002
Lo.702-003
Lo.702-004
Lo.702-005
")$Location

# Combine the data into wide format
df <- cbind(accounts, locations = t(locations))
# Restructure the data in long format
reshape(df, varying = grep("locations", names(df)), direction = "long" )
Rick
  • 888
  • 8
  • 10