PNG File Reader

Pages: 12
I have been on a quest to write some kind of PNG file reader, mostly just for myself, and I was wondering if there is any good resources on how to do binary file reading. I've been mostly trying to figure it out myself but I am finding it difficult to do which im assuming is a gap in my knowledge mostly. My main issue right now is trying to search for specific chunks in a PNG file, any help in anyway is appreciated. I know that there probably isn't much practical use in writing a PNG reader but I've been mostly doing it out of curiosity and intrigue but I find it near impossible to find good resources on how to do it. You can judge me for picking up a silly project like this but my brain wants me to do this so please help a fellow programmer out. Thank you :)
Its unclear where you are in your adventure.
Its a *fine* project to learn something.

Where are you on these steps?
1) c++ knowledge to simply read and write a binary file. Can you do this, like a float, an int, a string, write it out and read it back in, in binary?
2) learn a hex editor. Can you open the file you wrote in 1) above and read it in a hex editor to verify it, and to some extent see what is in the file?
3) study the png file format. Do you know what that looks like? How big is the image, where are the pixels, what time was it taken, ... etc?
take a look at https://en.wikipedia.org/wiki/PNG#File_format to start

4) start to work on the project, using 1,2, and 3 to do something? Start simple ... maybe read an image in, swap say all the blue and red in the pixels, write it back out, and look at it in an image program to see that you did indeed mess it up?

related, but make sure you understand how byte order works, and what byte order the file is in. Sometimes you have to convert the values from one to another (big and little endian) yourself to get the correct numbers.

If I may make a suggestion, consider the old .tga (targa) format using an uncompressed image. This is possibly the most simple image format of all time, with very little beyond like the dimensions & the RGB array data. Png isn't rocket science like jpeg but its still nontrivial if its your first time. You may have to poke around for an editor that displays tga these days, but anything with a lot of formats will, while minimal stuff like ms paint may not.
Last edited on
Where are you on these steps?
1) c++ knowledge to simply read and write a binary file. Can you do this, like a float, an int, a string, write it out and read it back in, in binary?
2) learn a hex editor. Can you open the file you wrote in 1) above and read it in a hex editor to verify it, and to some extent see what is in the file?
3) study the png file format. Do you know what that looks like? How big is the image, where are the pixels, what time was it taken, ... etc?
4) start to work on the project, using 1,2, and 3 to do something? Start simple ... maybe read an image in, swap say all the blue and red in the pixels, write it back out, and look at it in an image program to see that you did indeed mess it up?

As of now I have prior knowledge of C++ and a bit about the fstream header, I understand how to open a file (in binary mode) extract bytes from the file.

I have been reading about the PNG file format for a little while now and I understand some of the basic format of what is in the file. I've been able to extract the IHDR chunk header using the fstream and gather various attributes of the PNG.

My main issue as of now is that I am not sure how to search the file for the specific chunk headers, as from what I've read the chunk headers (except for the IHDR and IEND chunks) can occur in any order. Any advice on how I would do that?
my advice is, DON'T :)

Read the file once. Process what you encounter as you encounter it, even if 'process' means 'hold it for later' or even 'discard'. If you need to process the chunks out of order from the file, then read them as you go and put them into the order you want them as you do that.

If you insist on looking for the headers, you can jump around, but I don't know the format. Eg if you get a header, do you now know the size of its record? Then you can jump past all that to the next record and get its header. Many binary files have headers with the sizes in them for just that reason, but you need to be very sure you jump to the right place each time. Any mistake and you are reading junk as if it were a header.
Last edited on
You will, in practice, never find chunks out of order.

Some chunks are unordered relative to each other — which is fine, it won’t affect your ability to read the image data.

The two main ways to read the file are:

❶ Read byte-by-byte using something like ifstream::get(). This is very easy to do and works just fine.

❷ Read the whole thing into a std::string or std::vector<std::byte> or something, then you can pick at the pieces at your leisure.

Both ways work just as well, and are ultimately equivalent, believe it or not (your reader can easily be written to work over both with maybe a single line of code different between the two).

DO make sure to open the file/stream in std::ios::binary mode!

To track where a chunk is just keep a std::vector<std::size_t> of offsets into the file/stream/memory as you encounter them. That way you don’t have to search around for any specific chunk more than once. The PNG reader doesn’t really need to do that, though.
The two main ways to read the file are:

❶ Read byte-by-byte using something like ifstream::get(). This is very easy to do and works just fine.

❷ Read the whole thing into a std::string or std::vector<std::byte> or something, then you can pick at the pieces at your leisure.

Both ways work just as well, and are ultimately equivalent, believe it or not (your reader can easily be written to work over both with maybe a single line of code different between the two).


Okay thank you ill try to figure these out and see how I do, another question though, with data in the std::byte vector if I want to store the data in a variable should I use a casting operator? Like for the width and height they are stored as 4 bytes should I cast each byte into a char and use an OR bitwise operation into an integer? I do know that I will have to switch the edianess of the values. (I've ran into that issue before haha) I hope I'm not asking too many questions, I've just been struggling to figure this stuff out the past week or 2 and well my mind wont let it go untill I do. I'm getting there I think...
yes, cast the bytes into the integer then use the built in tools for converting it to the other endian format. Intel processers have an instruction on the chip that does this very fast, and visual studio at least can tap it, rather than trying to DIY with swap operations.

you dont need to cast into a char at all, that has no purpose. Just cast the 4 bytes into the integer directly (reinterpret cast).

No offense but these are parts of 1 & 2 that I listed. Write a separate program that writes a couple of integers to a binary file and then read them out. Look at them in a hex editor. Edit the file and put one of them in with the reversed endian and then have your program correct it (it can be as simple as the number 1, which is like 00 00 00 01 vs 01 00 00 00). The value doesn't even matter .. pick something easy on the eyes like 00 ab cd ef and print it in hex, do the endian conversion, print it again...

I always keep a trash program around with a couple dozen of the most useful includes/modules so you can just try a few lines to figure out how something works before you try it in something big. Play with it, get it right, then use it.
Last edited on
It is quick and easy to build the integer from four bytes. Since all multibyte integers in a PNG are in network byte order, just build it as such. You can even cheat with a pointer to the first byte.

1
2
3
4
5
6
7
8
9
10
std::uint32_t
bytes_to_integer(
  std::uint8_t * bytes,
  int            n )
{
  std::uint32_t result = 0;
  while (n--)
    result = (result << 8) | *bytes++;
  return result;
}

If you are writing code to read from stream, you can do it like this, for example (this is not the only way, but one I tend to like):

1
2
3
4
5
6
7
8
template <typename IntType>
std::istream & read_integer( std::istream & f, IntType & value )
{
  std::uint8_t bytes[8];
  if (f.read( reinterpret_cast <char *> (bytes), sizeof(value) ))
    value = bytes_to_integer( bytes, sizeof(value) );
  return f;
}

Use it thus:

1
2
3
4
5
std::ifstream pngf( "my.png", std::ios::binary );
...
std::uint32_t chunk_length;
read_integer( pngf, chunk_length );
...

And so on.


You know, I always forget about why I dislike the automatic size selection. Don’t do that: be specific.

1
2
3
4
5
6
7
8
template <typename IntType>
std::istream & read_integer( std::istream & f, IntType & value, std::size_t n )
{
  std::uint8_t bytes[8];
  if (f.read( reinterpret_cast <char *> (bytes), n ))
    value = bytes_to_integer( bytes, n );
  return f;
}
1
2
3
4
5
std::ifstream pngf( "my.png", std::ios::binary );
...
std::uint32_t chunk_length;
read_integer( pngf, chunk_length, 4 );
...

Sorry to have given bad advice. Fixed.
Last edited on
Ah okay thank you for the help I will see what I can do :). I was mainly asking about the cast since I know that often times people say to avoid it but yeah it makes sense why you would in this case. thanks again I'll keep this stuff in mind for the future and try to work on some practice things first!
I apologize for somewhat going off topic vis-a-vis the OP's thread here. My multiple efforts to post this as a new thread under my own account name (which I had not used in a while) failed. That's why I'm using this means to get my very pressing question/problem out. Thx.

OK, please, consider this seemingly simple, but knotty problem.

A binary string of length n (a string containing only '0's and '1's) is constructed according to the following special rule: every occurrence of '0' in the string must appear as a consecutive block (all '0's together) of length that is a multiple of k. That is, the number of consecutive '0's must be k, 2k, 3k, ...
Given n and k, write a function that determines the total number of valid binary strings that can be formed.

For example:
if n = 3, k = 2; then the function returns 3 (since valid strings are: 111, 100, 001)
if n = 4, k = 2; then the function returns 5 (since valid strings are: 1111, 1001, 0011, 1100, 0000)

I wrote the code snippet below and obtained correct results for the two cases above.

1
2
3
4
5
6
7
8
9
10
11
int validStr(int n, int k) {
    int j {0};
    int h {1};
    while (h*k <= n){
        for (int i = 0; i < n - h*k + 1; ++i){
            ++j;
        }
        ++h;
    }
    return j + 1;
}

However, I got perturbed that my answer was deemed wrong when compared to the examiner's for some other cases. Such cases as I could easily manually prove I was right!
For instance, when n = 5 and k = 2, my code returns 7 (11111, 00111, 10011, 11001, 11100, 00001, 10000) but the examiner's (hidden) code says the answer is 8.
When n = 10 and k = 2, my code returns 26; the examiner's returns 89. My answer is so way off here!

I've concluded that there is something about the problem definition and/or interpretation I am not getting! Any ideas/insights?
the site is buggy, someone may know what to do about that for you. Try multiple browsers, or the classic or new interface (whatever you didn't use yet).
and ... riddle me this...
00100

I believe you can do this directly (just an equation with a couple of conditional modifiers for all zeros and all 1s). But all 1s always exists, so you just start your counting at 1 to account for that case. All zeros exists only if x*k == N, so you just add 1 if it exists. Otherwise, pretend K is a digit. eg for your case
111k
11k1
1k11
k111
11kk
1k1k
k11k
1kk1
kk11
1kkk
kkk1
...etc I dunno if I got all the combos but you get the idea. If you can arrange it so that the digit count is right (I bungled it here) it should be a formula...
Last edited on
@jonnin,

Thx for ur reply. Contrary to what u imply, k is not necessarily a binary digit but the number of consecutive zeros. Moreover, if k were to be a digit, some of ur entries (i.e. 1k1k, k11k) would not be valid since the 'k's in them are not consecutive. According to the question spec, if '0's appear in any string, they cannot appear in separated chunks, they have to be all bunched together.
For instance, when n = 5 and k = 2, my code returns 7 (11111, 00111, 10011, 11001, 11100, 00001, 10000) but the examiner's (hidden) code says the answer is 8.


Isn't 00100 also valid?
K obviously isn't a digit, but you can compute the number of variations using it as if it were, as long as you also adjust the max length vs number of Ks present and how many digits K contains (smallest). Its not trivial, but I think that path may lead you to an answer that isn't brute force. But it seems like a pen and paper problem --- draw up the patterns for k = 2,3, and 4 for N = 5,6,7 and figure out what the pattern is and from there, the formula that just provides the answer.

I was sloppy above, trying to type it out... its not right but the idea should bear fruit. Or if you have a better way, tell us. And keep trying to make a new topic too, this forum is dead enough that we can handle a bit of weirdness but you should figure it out if you want to hang with us.
Last edited on
@ seeplus,
Given the constraints of the problem, 00100 would not be a valid string because the zeros are not all consecutive; unless I am grossly misinterpreting the examiner's stipulated conditions in my OP above!

@ jonnin,
If I knew of a better way, I wouldn't have posted the question in the first place. What confounds me the most is the disparity in my answer vis-a-vis the examiner's for the n = 5, k = 2 case. Since the small data size allows this to be manually verified (as I showed), there is no way, given the stated constraints, that the answer is 8 and not 7 (as I got).

The n = 3, k = 2 and n = 4, k = 2 example cases were part of the problem statement. These definitely align with the problem's conditions; I'm still waiting for someone to prove how the n = 5, k = 2 results in 8 without violating the obvious problem constraints.
@geeloso
What is the exact wording of your assignment?
Because as you have explained it I see no error in your logic.
Hmmm... seems like I just had a brainwave and started to visualize the problem from a different angle! I would try to loosen the examiner's meaning of the stated constraint of "... consecutive block (all 0s together) ... " in the problem statement to mean the 0 appearances do not necessarily have to be a single substring. With this alleviating caveat, I would see what comes out at the end.
Yes, that is how I initially read it. Each “block” of  0 s must be indivisible, but the blocks themselves can be interspersed between the  1 s in any fashion.

When that is the case, then N=10,K=2 does indeed produce 89 possible strings.
Last edited on
BTW, if you don’t mind a little help, this is a mathematical question: What jonnin was trying to push you to get was that you can use the Binomial Coefficient to help you compute things. Oft times in CS classes, especially when dealing with n-ary data structures, you should be looking to see if there is a mathematical pattern.
@ Duthomhas

Thx for ur posts! I just read ur replies and hence, my apologies for not responding to the "exact wording of the assignment" u requested for. U might have read it earlier in my OP but, here it is again below with very little paraphrasing:

1
2
3
4
5
6
7
8
A binary string of length n (a string containing only '0's and '1's) is constructed according to the following special rule: 
every occurrence of '0' in the string must appear as a consecutive block (all '0's together) of length that is a multiple of k. 
That is, the number of consecutive '0's must be k, 2k, 3k, ...
Given n and k, write a function that determines the total number of valid binary strings that can be formed.

For example:
if n = 3, k = 2; then the function returns 3 (since valid strings are: 111, 100, 001)
if n = 4, k = 2; then the function returns 5 (since valid strings are: 1111, 1001, 0011, 1100, 0000)


Good that u seemed to comprehend right away what the examiner intended! I think the ambiguous nature of what he meant by consecutive blocks of 0s threw me off somewhat!
Last edited on
Pages: 12