
I am working on code that, among other things, must save a zero-filled file. It may need to create a blank file anywhere from 10 MB to 1 GB.

It must be similar to this Unix command:

dd if=/dev/zero of=my_image.dsk count=20480

I got this to work with small sizes:

int totalSize = 1024*1024;
char buf[totalSize];
strcpy(buf,"");

NSData *theData = [NSData dataWithBytes:buf length:sizeof(buf)];

//NSLog(@"%@", theData);
NSLog(@"%lu", sizeof(buf));

[theData writeToFile:@"/Users/foo/Desktop/my_image.dsk" atomically:YES];

But if I try a bigger value (1024*1024*10, for instance), it crashes. So I tried:

NSFileManager *ddd = [[NSFileManager alloc ] init];
[ddd createFileAtPath:@"/Users/foo/Desktop/my_image.dsk" contents:[NSData data] attributes:nil];

It creates an empty file, but is not good because the file size is zero. It should not be zero.

I spent hours trying to find an answer without success. I want to do this in Obj-C, but C is also an option before I go nuts.

Please, could someone shed some light on this?

Thanks in advance!

-- Edit --

Thanks everyone, but one more thing: is it possible to write the file without allocating everything in memory?

Apollo
  • It crashes because you are allocating it on the stack. You have a limited amount of memory on the stack. You are trying to allocate 1 MB on the stack as it is; you are better off using a malloc & memset call. – Richard J. Ross III Jan 16 '12 at 00:56
  • @RichardJ.RossIII: You should not `malloc` & `memset` in general, use `calloc` instead. You can save a lot of memory that way. – Dietrich Epp Jan 16 '12 at 01:11
  • @DietrichEpp Really? I've heard the other way around, and that memset is better than a calloc call, because it is inlined more? Just what I've heard, can't tell ya either way. – Richard J. Ross III Jan 16 '12 at 01:16
  • Also note that `-dataWithBytes:length:` copies the data sent to it; you would be better off using `-dataWithBytesNoCopy:length:` instead. – Richard J. Ross III Jan 16 '12 at 01:20
  • @DietrichEpp: How's that? I thought calloc just did the multiplication for you and bzeroed the allocated memory. – Chuck Jan 16 '12 at 01:35
  • @Chuck: It has the option of doing that, but it can also give you pre-zeroed memory or give you copy-on-write zeroed pages from the OS. For large allocations (1 MiB or larger), modern Unix-like systems almost invariably use copy-on-write, so the allocation uses no memory except a little bit of overhead. You can test it: write a program that allocates 2x total RAM with `calloc`, and compare it to a program that uses `malloc`/`memset`. The difference is ENORMOUS. (The `bzero` function is almost universally implemented as a call to `memset` these days.) – Dietrich Epp Jan 16 '12 at 02:02

4 Answers

4

This is relatively simple in the POSIX API:

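// O_WRONLY is required; without an access mode the descriptor is read-only and pwrite() fails
int f = open("filename", O_WRONLY|O_CREAT|O_EXCL, S_IRUSR|S_IWUSR);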
if (f < 0) {
    perror("open filename");
    exit(1);
}

char empty = 0;

pwrite(f, &empty, sizeof(char), <offset into the file>);

The offset is specified using an integer of type off_t, which might not be large enough for your specific needs. Check your system API documentation to discover the 64-bit file interfaces if it isn't large enough.

The best part of this approach is that it only takes a few bytes of program memory -- you're not allocating 10 gigabytes of nothing to write your file.
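To make the offset concrete, here is a minimal C sketch, assuming the goal is a file of exactly `size` bytes. Writing a single zero byte at offset `size - 1` sets the file length, and the unwritten range before it reads back as zeros. The helper name `create_zero_file_pwrite` is hypothetical, and the sketch opens the file with `O_WRONLY` so that the descriptor is writable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with a length of `size` bytes by
 * writing one zero byte at offset size - 1.
 * Returns 0 on success, -1 on failure. */
int create_zero_file_pwrite(const char *path, off_t size)
{
    if (size <= 0)
        return -1;

    /* O_EXCL makes the call fail if the file already exists. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    char zero = 0;
    if (pwrite(fd, &zero, 1, size - 1) != 1) {
        perror("pwrite");
        close(fd);
        unlink(path); /* do not leave a partial file behind */
        return -1;
    }

    return close(fd);
}
```

For example, `create_zero_file_pwrite("my_image.dsk", 10 * 1024 * 1024)` would produce a 10 MiB file. On filesystems that support sparse files, the result may occupy very little disk space, which may or may not be what you want.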

Another even simpler approach is to use the truncate(2) system call. Not all platforms support extending the file with truncate(2), but for the ones that do, it's one call.
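A minimal C sketch of the truncate approach, assuming a platform where `ftruncate(2)` extends a file when the requested length is larger than the current one. The helper name `create_zero_file_truncate` is hypothetical; it uses `ftruncate` on an open descriptor rather than `truncate` on a path, so that creation and resizing happen on the same file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and extend it to `size` bytes.
 * The extended range reads back as zeros.
 * Returns 0 on success, -1 on failure. */
int create_zero_file_truncate(const char *path, off_t size)
{
    if (size < 0)
        return -1;

    /* O_EXCL makes the call fail if the file already exists. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    if (ftruncate(fd, size) != 0) {
        perror("ftruncate");
        close(fd);
        unlink(path); /* do not leave a wrongly sized file behind */
        return -1;
    }

    return close(fd);
}
```

As with the `pwrite` approach, the resulting file may be sparse on filesystems that support it.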

sarnold
  • Note that not all applications want to create sparse files. Might want to change permissions to 0666 and let the user set their own `umask`. – Dietrich Epp Jan 16 '12 at 01:09
  • Great, I was looking for something that does not allocate everything on memory, so this is a good start. But you lost me with this offset.. xD – Apollo Jan 16 '12 at 01:45
  • Extending the file with `truncate(2)` appears to be required by POSIX. – Hugh Jan 16 '12 at 01:48
  • @Huw: Nice; some day the older systems that didn't support some of POSIX's nicer features will be odd or obscure enough that we'll be able to clean up a lot of code like this. :) – sarnold Jan 16 '12 at 23:35
3

You can use dd. This will let you write, e.g., 10 GB without worrying about memory or address space.

NSTask *task = [NSTask new];
long size = ...; // Note! dd multiplies this by 512 by default
NSString *path = ...;
[task setLaunchPath:@"/bin/dd"];
// NSTask supplies argv[0] itself, so the list starts with the first dd operand
[task setArguments:[NSArray arrayWithObjects:@"if=/dev/zero",
    [NSString stringWithFormat:@"of=%s", [path fileSystemRepresentation]],
    [NSString stringWithFormat:@"count=%ld", size],
    nil]];
[task launch];
[task waitUntilExit];
if ([task terminationStatus] != 0) {
    // an error occurred...
}
[task release];
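Because `count` is measured in blocks of 512 bytes by default, a byte size has to be converted before it is passed to `dd`. A small C sketch of that arithmetic, assuming the default block size is left unchanged; the helper name `dd_block_count` is hypothetical, and it rounds up so the file is never smaller than requested.

```c
#include <stdint.h>

/* dd's default block size when bs= is not given. */
enum { DD_DEFAULT_BLOCK_SIZE = 512 };

/* Hypothetical helper: number of 512-byte blocks needed to hold
 * `bytes` bytes, rounded up to a whole block. */
uint64_t dd_block_count(uint64_t bytes)
{
    return bytes / DD_DEFAULT_BLOCK_SIZE
         + (bytes % DD_DEFAULT_BLOCK_SIZE != 0);
}
```

For a 10 MiB image this gives 10 * 1024 * 1024 / 512 = 20480 blocks, which matches the `count=20480` in the question.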

An advantage is that sophisticated users can kill dd if something goes wrong.

About sandboxing: My suspicion is that this will work fine even in a sandbox, but I'd be curious to know for sure. According to documentation (link), a subprocess "simply inherits the sandbox of the process that created it." This is exactly how you'd expect it to work on Unix.

You can be sure that dd will stick around since Apple claims that OS X conforms to SUSv3.

Dietrich Epp
  • Interesting.. do you think it would be too much work to implement a "kill" action on error by default? – Apollo Jan 16 '12 at 01:47
  • Got an error: `Program received signal: "SIGABRT"` and in the log: `2012-01-16 01:56:05.977 File handler[2448:903] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'launch path not accessible' *** Call stack at first throw:` – Apollo Jan 16 '12 at 02:00
  • @GiancarloMariot: As written, errors are detected when `dd` exits, so there is no need to kill `dd`. The `dd` program will normally exit immediately upon encountering an error. – Dietrich Epp Jan 16 '12 at 02:04
  • @GiancarloMariot: Looks like you need to add `setLaunchPath:`. I've changed the code above. – Dietrich Epp Jan 16 '12 at 02:05
  • Thanks Dietrich.. Best option so far. I'll check all again in the morning (2am, GMT 0 here).. – Apollo Jan 16 '12 at 02:31
  • Definitely nice, but it does code in requirements that `dd` exist, is runnable by the current user and process (OS X does have sandbox environments available that can prevent it), and that it exist at `/bin/dd`. But ten points for making something really easy. :) – sarnold Jan 16 '12 at 23:38
  • @sarnold: `dd` is guaranteed to exist according to Mac OS X documentation, which states that Mac OS X conforms to SUS3. (The code also assumes that the `NSTask` class exists, that `/dev/zero` and `Foundation.framework` exist, and other such things.) There exists no user unable to run `dd` nor is such a situation foreseeable. I don't know how sandbox entitlements get inherited by `NSTask` objects, so that is the only real issue. – Dietrich Epp Jan 17 '12 at 00:23
  • Note that you can tell which programs are executable by arbitrary users by looking at their paths: everything in `/bin` is runnable by everyone, programs in `/sbin` often require privileges. – Dietrich Epp Jan 17 '12 at 00:26
  • Thanks for the link to the sandboxing guide, that's superb. The last time I tried to find out more information about it, I found it remarkably underdocumented. – sarnold Jan 18 '12 at 22:13
1

Try this code:

int totalSize = 1024*1024;
// don't allocate on the stack
char *buf = calloc(totalSize, sizeof(char));

NSData *theData = [NSData dataWithBytesNoCopy:buf length:totalSize];

//NSLog(@"%@", theData);
NSLog(@"%d", totalSize); // totalSize is an int, so %d rather than %lu

[theData writeToFile:@"/Users/foo/Desktop/my_image.dsk" atomically:YES];

You could also try this instead. It uses far less memory, and it works with a 1 GB file.

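void writeBlankBytes(FILE *file, size_t numKB)
{
    // A static array is zero-filled and needs no allocation or NULL check.
    static const char oneKBZero[1024] = {0};

    // size_t index avoids a signed/unsigned comparison with numKB.
    for (size_t i = 0; i < numKB; i++) {
        // Stop early if a write fails (for example, when the disk is full).
        if (fwrite(oneKBZero, 1, sizeof oneKBZero, file) != sizeof oneKBZero) {
            break;
        }
    }
}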
Richard J. Ross III
  • What happens to this approach as `totalSize` approaches 10GB? – aroth Jan 16 '12 at 01:03
  • Well then you are screwed, unless for some ungodly reason you have more than 10 GBs of RAM. – Richard J. Ross III Jan 16 '12 at 01:04
  • @RichardJ.RossIII: If you used `calloc` instead of `malloc`/`memset` (which you always *should*), you could be fine on 64-bit computers even if they don't have much RAM. – Dietrich Epp Jan 16 '12 at 01:07
  • @RichardJ.RossIII: `sizeof(buf)` is the same as `sizeof(char *)`, which is 4 or 8 on most systems. You probably mean `totalSize`. Note that `sizeof(char) == 1` is explicitly guaranteed by the C standard, so it can be omitted. – Dietrich Epp Jan 16 '12 at 04:27
1

Thanks everyone. I arrived at a solution. This is basically a sketch of the code that I am about to put in a Cocoa app.

Works fine and I think I only need some precautions with the size variable and it will be enough.

long size = 2000*500;

NSString *path = [[NSString alloc] initWithFormat: @"/Users/foo/Desktop/testeFile.dsk"];

NSTask *task = [[NSTask alloc] init];
[task setLaunchPath:@"/bin/dd"];
[task setArguments:[NSArray arrayWithObjects:
                    @"if=/dev/zero",
                    [NSString stringWithFormat:@"of=%s", [path fileSystemRepresentation]],
                    [NSString stringWithFormat:@"count=%ld", size],
                    nil]];
NSLog(@"%@",[task arguments]);
[task launch];

//[task waitUntilExit]; // The app would get stuck here, so poll in a loop instead

int cont = 0;
while ([task isRunning]) {
    cont++;
    if(cont%100==0) NSLog(@". %d", cont);
    [NSThread sleepForTimeInterval:0.01]; // avoid spinning at full CPU while dd runs
}

if ([task terminationStatus] != 0) {
    NSLog(@"an error occurred...");
}
[task release];
[path release];

NSLog(@"Done!");
Apollo
  • Do note that this places a dependence upon `dd` being available and specifically located in `/bin/dd` -- if either requirement is not met your program will fail. While this will probably work for you for years, I think simpler methods are available that won't have these dependencies. – sarnold Jan 16 '12 at 23:34
  • @sarnold I wish I could find a good method as you say, but all of them load the whole file into memory before saving, and if a user asks for a 1GB file, it will probably crash the application or at least become an elephant. I've been looking everywhere for a simple solution, then I got here, and the only solutions I found are the ones you can see here.. =( If you have a new suggestion, it would be great! =) – Apollo Jan 18 '12 at 17:39
  • [Dietrich's answer](http://stackoverflow.com/a/8874811/377270) is good; the chances of running into the `NPROC` `setrlimit(2)` constraint against starting new processes are pretty low, and the link to the sandboxing guide he provided indicates to me that you, as the application author, rather than the _user_, are responsible for providing the sandbox, so that's not likely to be a real impediment to your application. My suggestion _does_ create a sparse file (which may not be what you want) and it requires working at a different API level. – sarnold Jan 18 '12 at 22:17