PDFsharp - moved to http://forum.pdfsharp.net/ Forum Index PDFsharp - moved to http://forum.pdfsharp.net/
Please visit the new PDFsharp forum at http://forum.pdfsharp.net/

Modifying stream data for a page

bvsoftware



Joined: 08 Nov 2006
Posts: 2

Posted: Wed Nov 08, 2006 4:23 pm    Post subject: Modifying stream data for a page

Does anyone have code samples of how to modify the stream contents of a page?

I can get the stream decoded and modified but I can't seem to locate the "/Filter" element in order to encode the modified instructions.
_________________
http://www.bvsoftware.com - ASP.NET Cart
http://www.dotnetBB.com - ASP.NET Forum
http://www.SQLFindAndReplace.com - Utility
bvsoftware



Joined: 08 Nov 2006
Posts: 2

Posted: Wed Nov 08, 2006 9:55 pm    Post subject: Figured out part of it.

I've figured out the filter part now. Instead of pulling the single page content stream, I'm walking all the elements in the PDF file and looking for PdfDictionary objects. I check whether each one has a stream, and if it does I can get its /Filter value. Once I encoded my modified stream with that filter, the change showed up in the modified PDF.
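The decode, modify, re-encode round trip described in this post can be sketched as follows. This is a minimal Python illustration, not PDFsharp code. It assumes the stream's /Filter is /FlateDecode with no predictor parameters (other filters need different codecs), and the function name `rewrite_flate_stream` is hypothetical.

```python
import zlib


def rewrite_flate_stream(encoded, old, new):
    """Decode a /FlateDecode stream, replace some bytes, and re-encode it.

    Returns the re-encoded bytes and their length. The length is returned
    because the stream dictionary's /Length entry must be updated to match.
    """
    decoded = zlib.decompress(encoded)      # undo /FlateDecode
    modified = decoded.replace(old, new)    # edit the content operators
    reencoded = zlib.compress(modified)     # apply /FlateDecode again
    return reencoded, len(reencoded)


# Stand-in for a compressed page content stream (made-up content).
original = zlib.compress(b"BT /F1 12 Tf (Hello) Tj ET")
data, length = rewrite_flate_stream(original, b"(Hello)", b"(World)")
```

A plain byte replacement like this only works for literal strings whose bytes are known. It does not help with hex strings such as the one discussed in the rest of this thread.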

I am running into one more issue. InDesign writes text as hex-encoded strings. It looks like this:

<0044>Tj

This is displayed as the letter "a", but I haven't figured out the encoding yet, because 0x0044 doesn't match the standard ASCII or Unicode value for "a".
Stefan Lange



Joined: 12 Oct 2006
Posts: 47
Location: Cologne, Germany

Posted: Thu Nov 09, 2006 1:26 pm

PDFsharp 0.9 contains an early implementation of PdfSharp.Pdf.Content.ContentReader, which converts a content stream into a sequence of instances of objects derived from CObject. My current code also has a ContentWriter that converts the objects back to a content stream.

This works fine, but it is just the beginning of the problem. The meaning of
Code:
<0044>Tj
can be determined only together with the font that is used. PDF has no native support for Unicode. To use Unicode, a so-called CID (character ID) font must be derived from the underlying TrueType font. 0x0044 is NOT the Unicode character but the glyph ID within the TrueType font the CID font is based on. To look up in reverse which character corresponds to 0x0044, you must use the ToUnicodeMap. It maps glyph IDs to Unicode characters. In general, glyph IDs and Unicode characters do not have the same values.
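The reverse lookup described here can be sketched in Python. This is a rough illustration, not PDFsharp code. `SAMPLE_CMAP` is a made-up fragment in the style of a font's /ToUnicode CMap, chosen so that 0x0044 maps to "a" as in this thread. The helper names `parse_tounicode` and `decode_hex_operand` are hypothetical, and the parser assumes 2-byte codes and skips the array form of bfrange.

```python
import re

# Hypothetical ToUnicode CMap fragment; a real one comes from the font's
# /ToUnicode stream (after decoding it with the stream's filter).
SAMPLE_CMAP = """
1 beginbfchar
<0044> <0061>
endbfchar
1 beginbfrange
<0045> <0047> <0062>
endbfrange
"""

HEX = r"<([0-9A-Fa-f]+)>"


def parse_tounicode(cmap_text):
    """Build a {character code: Unicode string} table from bfchar/bfrange."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(HEX + r"\s*" + HEX, block):
            # The destination is UTF-16BE text written as hex digits.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    for block in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        for lo, hi, dst in re.findall(HEX + r"\s*" + HEX + r"\s*" + HEX, block):
            start = int(dst, 16)  # simple form only: one starting code point
            for i, code in enumerate(range(int(lo, 16), int(hi, 16) + 1)):
                mapping[code] = chr(start + i)
    return mapping


def decode_hex_operand(operand, mapping):
    """Translate a hex string operand such as <0044> into readable text."""
    digits = operand.strip("<>")
    codes = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
    # Codes missing from the map become U+FFFD (replacement character).
    return "".join(mapping.get(code, "\ufffd") for code in codes)


mapping = parse_tounicode(SAMPLE_CMAP)
text = decode_hex_operand("<0044>", mapping)
```

Encoding goes the other way: invert the table and write each code back as four hex digits. That only works for characters whose glyphs are present in the embedded font.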

To save space, tools like InDesign typically embed only a subset of the Unicode fonts in the PDF file. The subset contains only the glyphs used in your document. To simplify the internal tables of this subset font, the glyphs are renumbered when the subset font is created. Without the corresponding ToUnicodeMap, even Acrobat cannot 'read' your Unicode text anymore (i.e. it cannot copy selected text to the clipboard), even though you can read the text perfectly well because you interpret the stroked glyphs…

The best you can do is to embed the whole font. Then the glyph IDs and Unicode values partially match, apart from an offset. With the help of the ToUnicodeMap you can encode or decode the text.
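As a worked example of the offset idea, the single data point in this thread (0x0044 displayed as "a", which is U+0061) implies an offset of 0x61 - 0x44 = 29 for that font. The Python sketch below assumes that one constant holds across the characters of interest. That is an assumption made for illustration only: the offset depends on the font and on the character range, and the helper names `guess_char` and `guess_code` are hypothetical.

```python
# The one observation taken from this thread.
OBSERVED_CODE = 0x0044
OBSERVED_CHAR = "a"

# Offset between the Unicode value and the glyph ID: 0x61 - 0x44 = 29.
# ASSUMPTION: constant for the characters of interest; varies per font.
OFFSET = ord(OBSERVED_CHAR) - OBSERVED_CODE


def guess_char(code, offset=OFFSET):
    """Guess the character for a glyph ID by adding the assumed offset."""
    return chr(code + offset)


def guess_code(char, offset=OFFSET):
    """Guess the glyph ID for a character by subtracting the assumed offset."""
    return ord(char) - offset


guessed = guess_char(0x0044)
```

A guess like this is only a heuristic. When the font carries a ToUnicode map, that map is the reliable source for the mapping.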

Regards
Stefan Lange