JFIF Image class - Java



#31
bdlt

bdlt

    Member

  • Member
  • PipPipPip
  • 876 posts
good job finding another reference

reply to post #29:

it looks like there will be 3 input arrays and 3 output arrays at each step(YCbCr)

.........................................................................

more reply to post #29:

s = s + (alpha(p,q) * double(I(p,q)) * cosines(p,m) * cosines(q,n));

alpha and cosines are arrays defined in the sample code

double() casts the input (I) from an int to a double (we have 3 inputs I for each pixel: Y, Cb, Cr)

p, q, m, and n are loop variables

............................................................................

we will do 3 operations on each pixel at each step - one each for Y, Cb, and Cr

............................................................................
#32
staticVoid

staticVoid

    Member

  • Topic Starter
  • Member
  • PipPip
  • 94 posts
I think I understand:

Y - first block of 8x8
Cb - second block of 8x8
Cr - third block of 8x8

Y - fourth block of 8x8
Cb - fifth block of 8x8
Cr - sixth block of 8x8

and so on...

so basically three 8x8 blocks of data will represent 64 pixels?

then each (signed) value is passed through the DCT (for compression)

but http://en.wikipedia.org/wiki/JPEG and http://www.opennet.r...ormats/jpeg.txt list the steps in a different order?

which one is correct, or does it matter?

Edited by staticVoid, 24 August 2007 - 01:42 AM.


#33
bdlt

yes - there will be 3 8x8 blocks of data for 64 pixels

I noticed the order was different, too. Unfortunately, it does matter. I 'glanced' through the standard yesterday looking for an answer ... didn't find it. It looks like we will need another reference or 2 before we are done.

The decompression order matters because it HAS to be the same as the compression order. I'm guessing that the 'wiki' order is correct, but that is just speculation.

We will be poring over specs for the next few weeks and eventually will see a consensus on the order. Or, once all of the steps are coded, we can simply try each sequence. If the code is well written, that will be easy - swap 2 lines of code.

#34
staticVoid

In reply to post #22:

You said the raw image data will be sequential, which I understand, but would it not be much simpler to make the first 64 integers of the image data the first 8x8 block, and so on? Say I had a 16x16 image (to simplify things), which holds 256 values, i.e. four blocks:

(1st block)[1, 2, 3, 4... 63, 64]
(2nd block)[65, 66, 67, 68.... 127, 128]
(3rd block)[129, 130, 131, 132.... 191, 192]
(4th block)[193, 194, 195, 196.... 255, 256]


and so on...?

#35
bdlt

yes, that would make it a lot easier if we filled each block sequentially.

unfortunately, when images are compressed, the 8x8 blocks are not filled sequentially.

we must 'undo' each compression step in the reverse order that the compression was done.

your approach of understanding compression is sound. each compression step has to be reversed. we will have to rely on 'good' references to know what was done during compression, then write code to undo each compression step.
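
to illustrate why the first 64 integers are not the first block: each 8x8 block covers an 8x8 square of the image, so its rows come from 8 different image rows. a minimal sketch, assuming the samples of one component sit in a row-major array and the width is a multiple of 8 (BlockExtractor, samples, width, bx and by are made-up names for illustration, not from any reference):

```java
final class BlockExtractor {
    static final int N = 8;

    // Copies the 8x8 block at block column bx, block row by
    // out of a row-major sample array of the given width.
    static int[][] extractBlock(int[] samples, int width, int bx, int by) {
        int[][] block = new int[N][N];
        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                int x = bx * N + col;   // column in the full image
                int y = by * N + row;   // row in the full image
                block[row][col] = samples[y * width + x];
            }
        }
        return block;
    }
}
```

for a 16x16 image numbered 1 to 256 in row order, the first block's rows would be 1..8, 17..24, 33..40 and so on, not 1..64.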

#36
staticVoid

I'm currently tweaking each class to compress and decompress. The status of each class so far:

Transform Color Space - complete
DownSampling - complete but not sure
LevelShift - complete
CosineTransformation - incomplete
Zig Zag Reorder - complete
Quantization - complete
Run Length Encoding - not attempted
Huffman - not attempted

In the DownSampling class I've basically just returned:

Y=Y
Cb = Cb/2
Cr = Cr/2

is this correct?

I could attach a zip with all the class sources in my next post?

#37
bdlt

I believe the down sampling factors depend on how the image was compressed.

that info is in the header.

look at SOF0 below

http://www.obrador.c.../HeaderInfo.htm
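
on the Cb/2 question: down sampling normally means keeping fewer chroma samples, not halving each value. a minimal sketch, assuming 2x2 subsampling done by averaging and a plane with even width and height (the real factors have to come from the SOF0 header; ChromaDownsampler and downsample2x2 are made-up names, and averaging is just one common choice):

```java
final class ChromaDownsampler {
    // Replaces every 2x2 group of chroma samples with its rounded average,
    // so the output has half the width and half the height of the input.
    static int[][] downsample2x2(int[][] plane) {
        int h = plane.length / 2;
        int w = plane[0].length / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = plane[2 * y][2 * x] + plane[2 * y][2 * x + 1]
                        + plane[2 * y + 1][2 * x] + plane[2 * y + 1][2 * x + 1];
                out[y][x] = (sum + 2) / 4; // +2 rounds to nearest
            }
        }
        return out;
    }
}
```

Y would be left at full resolution; only Cb and Cr go through this step.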

yes - attach the current code

Edited by bdlt, 24 August 2007 - 04:05 PM.


#38
staticVoid

I know the quantization table is defined in the quantization marker, but I just put that one in as an example

oh, and I finally found out how to separate the image data from the markers - the image data immediately follows the start of scan (SOS) marker segment (see jpg.java)
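
A rough sketch of that idea, not the code from jpg.java: it walks the marker segments from the start of the file until it reaches SOS (FF DA) and returns the offset just past the SOS header, whose length is stored in the two bytes after the marker. ScanLocator and findScanData are hypothetical names, and the sketch assumes a well-formed file with no fill bytes between segments.

```java
final class ScanLocator {
    // Returns the offset of the first entropy-coded byte,
    // or -1 if no SOS segment is found.
    static int findScanData(byte[] jpeg) {
        int pos = 2; // skip the SOI marker (FF D8)
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != 0xFF) {
                return -1; // not positioned on a marker: give up
            }
            int marker = jpeg[pos + 1] & 0xFF;
            // segment length is big-endian and includes its own two bytes
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            if (marker == 0xDA) {
                return pos + 2 + length; // data starts after the SOS header
            }
            pos += 2 + length; // jump to the next marker
        }
        return -1;
    }
}
```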

The Test class can be ignored , I was just using it to test.

Edited by staticVoid, 25 August 2007 - 05:16 AM.


#39
bdlt

research notes:

ISO 10918-1 and wiki agree on the encoder order(quantization, then zigzag)

revised step sequence for jpeg.txt

COMPRESSION
1) [R G B] -> [Y Cb Cr]
2) down sampling
3) level shift
4) DCT
5) quantization
6) zigzag
7) zero run length coding
8) huffman encoding

DECOMPRESSION
8) huffman decoding
7) zero run length de-coding ... no action required
6) undo zigzag
5) de-quantization
4) IDCT
3) undo level shift
2) up sampling
1) [Y Cb Cr] -> [R G B]
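
as an illustration of decompression step 6 (undo zigzag), here is a minimal sketch that builds the zigzag scan order by walking the 15 anti-diagonals of the 8x8 block and then puts 64 coefficients back in row/column positions. ZigZag, scanOrder and undoZigZag are made-up names for this sketch:

```java
final class ZigZag {
    // order[i] = row * 8 + col of the i-th coefficient in zigzag order.
    static int[] scanOrder() {
        int[] order = new int[64];
        int idx = 0;
        for (int s = 0; s <= 14; s++) {          // s = row + col of the diagonal
            int lo = Math.max(0, s - 7);
            int hi = Math.min(s, 7);
            if (s % 2 == 0) {                     // even diagonals run bottom-left to top-right
                for (int row = hi; row >= lo; row--) {
                    order[idx++] = row * 8 + (s - row);
                }
            } else {                              // odd diagonals run top-right to bottom-left
                for (int row = lo; row <= hi; row++) {
                    order[idx++] = row * 8 + (s - row);
                }
            }
        }
        return order;
    }

    // Places 64 zigzag-ordered coefficients back into an 8x8 block.
    static int[][] undoZigZag(int[] coeffs) {
        int[] order = scanOrder();
        int[][] block = new int[8][8];
        for (int i = 0; i < 64; i++) {
            block[order[i] / 8][order[i] % 8] = coeffs[i];
        }
        return block;
    }
}
```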

#40
staticVoid

thanx

In reply to post #22 again

The description of the last stage of compression (Huffman) says that a stream of bits is output to the jpg file representing the whole 64 coefficient vector [DC, 63 x AC] for each block.

http://www.opennet.r...ormats/jpeg.txt

how can this be processed if the data is separated over the image file?

Edited by staticVoid, 25 August 2007 - 07:10 AM.

#41
bdlt

I'm not quite sure I understand your question.

here's a brief overview of the jpeg file:

the image data is represented by the huffman code

during compression the huffman code is written to the jpeg file

when a file is decompressed the image data available will be found in the file as huffman code

huffman code is the compressed version of the raw pixel data

look at the explanation of EOB in step 7 of jpeg.txt(there are most likely thousands of EOBs in each image). the explanation may help you visualize huffman 'blocks' of data.

I will be out the rest of today and will look for your reply late tonight.

#42
staticVoid

I understand the huffman code represents two values - one for the number of zeros preceding the value (the run) and the other for the number of bits used to store the value in the block (if not zero)

eg.

111000 111111

It says 111000 (56) is the huffman code for (0, 6). How do you extract these two values from the huffman code?

Here's how I would go about decompressing the file (correct me if I'm wrong anywhere):

1. retrieve all information from markers (quantization table etc.)
2. isolate image data
3. save all data before EOB (0, 0)
4. decompress data to hopefully get an array of 64 integers
5. repeat steps 3 and 4 two times to retrieve the Cb and Cr arrays
6. convert each value in Y, Cb, Cr to R, G, B - ( R = Y[0].convert() , G = Cb[0].convert(), B = Cr[0].convert() )
7. draw the first 8x8 (64 pixels) block in the top left corner of the container
8. repeat the process from step 3 for all of the image data

That's my understanding of how I would do it. If I'm doing anything wrong please tell me.

#43
bdlt

the decompression steps listed in your previous post look like a good start - we may add to them as we go.
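
one thing to watch in step 6: R, G and B each depend on more than one of Y, Cb and Cr, so the conversion takes all three values of a pixel at once. a minimal sketch using the usual JFIF conversion constants, assuming 8-bit samples with Cb and Cr centered on 128 (ColorConverter and yCbCrToRgb are made-up names):

```java
final class ColorConverter {
    // Keeps a value inside the valid 8-bit range.
    static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    // Converts one pixel from Y, Cb, Cr to {R, G, B}.
    static int[] yCbCrToRgb(int y, int cb, int cr) {
        int r = (int) Math.round(y + 1.402 * (cr - 128));
        int g = (int) Math.round(y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128));
        int b = (int) Math.round(y + 1.772 * (cb - 128));
        return new int[] { clamp(r), clamp(g), clamp(b) };
    }
}
```

with Cb = Cr = 128 the pixel is gray and R = G = B = Y.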
..................................................................................
general approach to decoding huffman code:
instead of parsing bytes, huffman decoding requires parsing of bits.

consider a couple of possible approaches:
1. convert the huffman bits into 0's and 1's and parse using StringBuffer(s)
2. convert the huffman bits into decimals and parse using ints

note: avoid using String objects inside loops - use StringBuffers instead. String objects will slow the process to a crawl.
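
a third option, different from the two approaches above, is to pull bits straight out of the byte array with shifts and masks. a minimal sketch (BitReader is a made-up class, and it assumes the scan bytes have already been isolated; it does not handle the FF 00 byte stuffing used inside JPEG scan data):

```java
final class BitReader {
    private final byte[] data;
    private int bytePos = 0; // index of the byte currently being read
    private int bitPos = 7;  // next bit within that byte, 7 = most significant

    BitReader(byte[] data) {
        this.data = data;
    }

    // Returns the next bit (0 or 1), most significant bit first.
    int readBit() {
        if (bytePos >= data.length) {
            throw new IllegalStateException("no more bits");
        }
        int bit = (data[bytePos] >> bitPos) & 1;
        if (--bitPos < 0) {
            bitPos = 7;
            bytePos++;
        }
        return bit;
    }

    // Reads n bits and returns them as an unsigned integer.
    int readBits(int n) {
        int value = 0;
        for (int i = 0; i < n; i++) {
            value = (value << 1) | readBit();
        }
        return value;
    }
}
```

a huffman decoder would call readBit() until the bits read so far match a code in the table, then call readBits(size) for the value that follows.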
..................................................................................
I will look for a good example of huffman decoding

Edited by bdlt, 26 August 2007 - 01:25 PM.


#44
bdlt

1111110 000000000 111000 111001 111000 101101 1111111110011001 10111

11111110110 00001 1011 0111 11011 1 1010

.................................................


the bit pattern above is from jpeg.txt


unfortunately, I believe there are errors in the explanation in jpeg.txt.

.................................................

1111110 000000000

these are the DC category code(1111110) and the DC value(000000000)

look at table K3 page 149 of the iso spec
find 1111110 - which corresponds to category 9
grab the next 9 bits - 000000000
look at table ??
find 000000000 - which corresponds to -511

.................................................
111000 111001
error #1
K3(used above) is the table for DC luminance
we expect the first AC category to be found in the AC luminance table(K5 page 150)
111000 can not be found in K5, but we find it in K6 AC chrominance
we will ignore the error and parse 111000 111001
from K6 111000 is 0/6(run/size)
grab the next 6 bits 111001, which = 57

.................................................
111000 101101
error #2 same as error #1 - 111000 is found in K6
we will ignore the error and parse 111000 101101
from K6 111000 is 0/6(run/size)
grab the next 6 bits 101101, which = 45

.................................................
1111111110011001 10111
from K6 1111111110011001 is 4/5
grab the next 5 bits 10111, which = 23

.................................................
11111110110 00001
from K6 11111110110 is 1/5
grab the next 5 bits 00001, which = -30

.................................................
1011 0111
from K5 1011 is 0/4 (we ignore the K5/K6 inconsistency)
grab the next 4 bits 0111, which = -8
.................................................
11011 1
from K6 11011 is 3/1
grab 1 bit, which = 1
...............................................
1010 - this is the EOB marker(end of block)
.............................................................
the 64 values are -511 (the DC value decoded above), then the AC values 57 45 0 0 0 0 23 0 -30 -8 0 0 0 1 ... the rest are 0

.................................................

I hope the K5/K6 issue doesn't confuse things too much.
I hope there are no errors in this post.
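
the two mechanical parts of the walkthrough above can be sketched in code: turning the extra bits into a signed value, and expanding the run/value pairs into the AC coefficients. CoefficientDecoder, extend and expandAc are made-up names for this sketch; the rule in extend is that a bit pattern whose top bit is 1 is positive as written, and otherwise 2^size - 1 is subtracted from it.

```java
final class CoefficientDecoder {
    // Converts `size` raw bits into the signed coefficient value.
    static int extend(int bits, int size) {
        if (size == 0) {
            return 0;
        }
        int threshold = 1 << (size - 1);
        // top bit set: positive value; otherwise a negative value
        return bits >= threshold ? bits : bits - (1 << size) + 1;
    }

    // Expands {run, value} pairs into the 63 AC coefficients (zigzag order).
    // Positions never written stay 0, which covers everything after EOB.
    static int[] expandAc(int[][] runValuePairs) {
        int[] ac = new int[63];
        int pos = 0;
        for (int[] pair : runValuePairs) {
            pos += pair[0];      // skip `run` zeros
            ac[pos++] = pair[1]; // then store the value
        }
        return ac;
    }
}
```

for example, extend(0b00001, 5) gives -30 and extend(0b0111, 4) gives -8, matching the values worked out by hand above.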

#45
staticVoid

I found A really good reference to Huffman coding:

http://www.cs.duke.e...poop/huff/info/

I think I'll be able to understand it from that. It's just the DCT/IDCT I'm confused about: every explanation of it uses words that require definition (and in the definitions are words that need defining). It's like a giant JTree. By the time I get through it all I'll be a professional physicist. There must be a step by step explanation of it somewhere?

I followed the DCT equation and attempted to implement it but the output array has weird values:
the first value is -1024.8967320256231 when it should be -415 and the rest is NaN?


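public class CosineTransformation {

   // Forward 8x8 DCT:
   // F(u,v) = 1/4 * C(u) * C(v) * sum over x,y of
   //          f(x,y) * cos((2x+1)*u*PI/16) * cos((2y+1)*v*PI/16)
   // where C(0) = 1/sqrt(2) and C(k) = 1 for k > 0.
   public double[][] DCT(double[][] f) {
      double[][] F = new double[8][8];
      for (int u = 0; u < 8; u++) {
         for (int v = 0; v < 8; v++) {
            // The scale factor applies to u and v separately. The earlier
            // version used a single c = 0.5 only when both were 0.
            double cu = (u == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            double cv = (v == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            double sum = 0.0;
            for (int x = 0; x < 8; x++) {
               for (int y = 0; y < 8; y++) {
                  // Math.cos, not Math.acos: acos returns NaN for arguments
                  // outside [-1, 1], which is where the NaN values came from.
                  // (PI/8) * (x + 0.5) * u is the same as (2x+1)*u*PI/16.
                  sum += f[x][y] * Math.cos((Math.PI / 8) * (x + 0.5) * u)
                                 * Math.cos((Math.PI / 8) * (y + 0.5) * v);
               }
            }
            F[u][v] = 0.25 * cu * cv * sum;
         }
      }
      return F;
   }

   // not implemented yet
   public void IDCT(int channel) {
   }
}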
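
For the empty IDCT stub, here is a rough sketch of the inverse transform written as a separate helper. It takes a block of coefficients instead of the channel index used by the stub, and InverseCosineTransformation and idct are hypothetical names. It uses the same C(0) = 1/sqrt(2) scaling as the forward DCT formula and makes no attempt at speed.

```java
final class InverseCosineTransformation {
    // Inverse 8x8 DCT:
    // f(x,y) = 1/4 * sum over u,v of
    //          C(u) * C(v) * F(u,v) * cos((2x+1)*u*PI/16) * cos((2y+1)*v*PI/16)
    static double[][] idct(double[][] F) {
        double[][] f = new double[8][8];
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                double sum = 0.0;
                for (int u = 0; u < 8; u++) {
                    for (int v = 0; v < 8; v++) {
                        double cu = (u == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                        double cv = (v == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                        sum += cu * cv * F[u][v]
                             * Math.cos((2 * x + 1) * u * Math.PI / 16)
                             * Math.cos((2 * y + 1) * v * Math.PI / 16);
                    }
                }
                f[x][y] = 0.25 * sum;
            }
        }
        return f;
    }
}
```

A block whose only non-zero coefficient is the DC term F[0][0] comes back as a flat block where every sample equals F[0][0] / 8.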

Edited by staticVoid, 27 August 2007 - 08:09 AM.
