Parsing multimedia from a PDF using Quartz

Posted on 2024-10-16 00:27:44

I've been trying to parse a screen annotation that references a file embedded in a PDF, but I can't get hold of the file's stream. Things that should be there, such as the EmbeddedFiles array or the Dests dictionary, are missing. I generated the PDF with Adobe Acrobat Pro 9, with compatibility set to Reader 6 or higher, in order to avoid the mandatory Flash video conversion.

This is the test PDF I'm using.

The code snippet that parses it is:

    CGPDFStringRef aTitle; // title is optional property. 
    if (!CGPDFDictionaryGetString(annotDict, "T", &aTitle)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen -> screen annot has no title");
#endif
        //return;
    }else {
#if DEBUG
        const char *screenTitle = (const char *)CGPDFStringGetBytePtr(aTitle);

        NSLog(@"PDFScrollView parseScreen -> screen  title %s",screenTitle);
#endif
    }

    // get action
    CGPDFDictionaryRef actionDict;
    if(!CGPDFDictionaryGetDictionary(annotDict, "A", &actionDict)) {
        return;
    }

    // parse action

    const char *name;

    if (!CGPDFDictionaryGetName(actionDict, "S", &name)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen -> screen annot action has no S (type) entry");
#endif
        return;
    }

    // Autoreleased string, so the early return below does not leak it.
    NSString *actionType = [NSString stringWithUTF8String:name];

    if (![actionType isEqualToString:RENDITION_ACTION_TYPE]) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen -> screen annot action is not rendition");
#endif
        return;
    }
    // get the rendition from the action dictionary
    CGPDFDictionaryRef renditionDict;
    if (!CGPDFDictionaryGetDictionary(actionDict, "R", &renditionDict)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen -> rendition action does not have rendition");
#endif      
        return;
    }
    // check if the rendition is media or selector
    const char *renditionType;
    if (!CGPDFDictionaryGetName(renditionDict, "S", &renditionType)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen -> rendition does not have type");
#endif  
        return;
    }
    // check rendition type
#if DEBUG
    NSLog( @"rendition type %s",renditionType);
#endif
    // Compare the C string directly; no temporary NSString to leak on return.
    if (strcmp(renditionType, "MR") != 0) {
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->rendition type is not  MR --> %s",renditionType);
#endif
        return;
    }
    // get media clip dictionary

    CGPDFDictionaryRef mediaclipDict;
    if (!CGPDFDictionaryGetDictionary(renditionDict, "C", &mediaclipDict)) {
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->rendition dictionary does not contain clip");
#endif
        return;
    }


    const char * mediaClipType;
    if (!CGPDFDictionaryGetName(mediaclipDict, "Type", &mediaClipType)) { // optional
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->media clip dictionary does not contain name");
#endif


    } else {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->media clip object dictionary name %s",mediaClipType);
#endif
    }

    const char *mediaClipSubtype;
    if (!CGPDFDictionaryGetName(mediaclipDict, "S", &mediaClipSubtype)) { // required
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->media clip dictionary does not contain subtype S");
#endif
        return;
    }

    // %s, not %@: mediaClipSubtype is a C string, not an Objective-C object.
    if (strcmp(mediaClipSubtype, "MCD") != 0) {
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->media clip subtype is not MCD ==>%s",mediaClipSubtype);
#endif
        return;
    }

    // get media clip name

    CGPDFStringRef mediaClipName;
    if (!CGPDFDictionaryGetString (mediaclipDict, "N", &mediaClipName)) { // optional
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->media clip dictionary does not contain name");
#endif

    } else {
#if DEBUG
        // Log the string's bytes; the CGPDFStringRef itself is not a C string.
        NSLog(@"PDFScrollView parseScreen ->media clip object dictionary name %s",CGPDFStringGetBytePtr(mediaClipName));
#endif
    }

    // get ASCII MIME type
    CGPDFStringRef mimeType;
    if(!CGPDFDictionaryGetString(mediaclipDict, "CT", &mimeType)) {

#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->media clip object does not contain mime type");
#endif
    }else {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->media clip object mime type %s",CGPDFStringGetBytePtr(mimeType));
#endif
    }


    // get content stream

    CGPDFDictionaryRef contentDict ;
    if (!CGPDFDictionaryGetDictionary(mediaclipDict, "D", &contentDict)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->media clip object does not contain content dict");

#endif
        return;
    }

    // check content type
    const char *contentType = NULL; // stays NULL when Type is absent

    if (!CGPDFDictionaryGetName(contentDict, "Type", &contentType)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict does not have type");
#endif
    } else {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict type %s",contentType);
#endif
    }

    // get file system (optional FS entry; message text is shared with the Type check)
    const char *fileSystem;
    if (!CGPDFDictionaryGetName(contentDict, "FS", &fileSystem)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict does not have type");
#endif
    } else {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict type %s",fileSystem);
#endif
    }

#if DEBUG
    // UF is the Unicode file name (message text is shared with the Type check)
    CGPDFStringRef description;
    if (!CGPDFDictionaryGetString (contentDict, "UF", &description)) {
        NSLog(@"PDFScrollView parseScreen ->content dict does not have type");
    } else {
        NSLog(@"PDFSCrollView parseScreen -> contentdict UF %s",CGPDFStringGetBytePtr(description));
    }
#endif

    // check whether it is a file specification; bail out if Type is missing or different
    if (contentType == NULL || strcmp(contentType, "Filespec") != 0) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict type %s is not file specification",
              contentType ? contentType : "(none)");
#endif
        return;
    }

    CGPDFStringRef fstring = NULL; // F holds the file name as a string
    if (!CGPDFDictionaryGetString (contentDict, "F", &fstring)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict does not have F string");
#endif
    } else {
#if DEBUG
        // only dereference fstring when the lookup succeeded
        NSLog(@"PDFScrollView parseScreen ->content f string %s",CGPDFStringGetBytePtr(fstring));
#endif
    }

    CGPDFStreamRef str; // here there's no stream at all
    if (!CGPDFDictionaryGetStream (contentDict, "F", &str)) {
#if DEBUG
        // same message text as above, but here it is the stream lookup that failed
        NSLog(@"PDFScrollView parseScreen ->content dict does not have F string");
#endif
    }

    // reference file
    CGPDFArrayRef referencedFileDict ;
    /*** does not find the RF ****/
    if (!CGPDFDictionaryGetArray (contentDict, "RF", &referencedFileDict)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict does not contain Referenced file dictionary RF");

#endif  
    }


    CGPDFDictionaryRef embeddedFileDict ;

    if (!CGPDFDictionaryGetDictionary(contentDict, "EF", &embeddedFileDict)) {
#if DEBUG
        NSLog(@"PDFScrollView parseScreen ->content dict does not contain embedded file dictionary EF");

#endif  
    }
    // EF is found successfully

    CGPDFDictionaryRef documentCatalog = CGPDFDocumentGetCatalog(_docRef);

    CGPDFDictionaryRef namesDict = NULL;

    if (!CGPDFDictionaryGetDictionary (documentCatalog, "Names", &namesDict)) { // optional
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->documentCatalog does not contain Names dict");
#endif
        return; // namesDict is unusable below without this entry
    }

    CGPDFDictionaryRef destsDict;

    if (!CGPDFDictionaryGetDictionary (documentCatalog, "Dests", &destsDict)) { // optional
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->documentCatalog does not contain Dests dict");
#endif
        //return;
    }

    CGPDFDictionaryRef embeddedFilesDict;

    if (!CGPDFDictionaryGetDictionary(namesDict, "EmbeddedFiles", &embeddedFilesDict)) {
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->namesDict does not contain embeddedFiles dict");
#endif
        return;
    }

    // EmbeddedFiles is a name tree: its Names array sits on that node, not on
    // namesDict (a root node may carry Kids instead of Names).
    CGPDFArrayRef namesArray;
    if (!CGPDFDictionaryGetArray(embeddedFilesDict, "Names", &namesArray)) {
#if DEBUG
        NSLog( @"PDFScrollView parseScreen ->EmbeddedFiles dict does not contain Names");
#endif
        return;
    }

Console output for this:

2011-02-05 01:59:35.324 xxxxxxxxxx[62350:207] -> parseLink annotation subtype Screen
2011-02-05 01:59:35.325 xxxxxxxxxx[62350:207]  parseScreen -> screen  title Annotation from Inception_HD.avi
2011-02-05 01:59:35.325 xxxxxxxxxx[62350:207] rendition type MR
2011-02-05 01:59:35.326 xxxxxxxxxx[62350:207]  parseScreen ->media clip dictionary does not contain name
2011-02-05 01:59:35.327 xxxxxxxxxx[62350:207]  parseScreen ->media clip object dictionary name 
2011-02-05 01:59:35.327 xxxxxxxxxx[62350:207]  parseScreen ->media clip object mime type video/avi
Current language:  auto; currently objective-c
2011-02-05 01:59:39.080 xxxxxxxxxx[62350:207]  parseScreen ->content dict type Filespec
2011-02-05 01:59:40.206 xxxxxxxxxx[62350:207]  parseScreen ->content dict does not have type
2011-02-05 01:59:41.647 xxxxxxxxxx[62350:207]  parseScreen -> contentdict UF Inception_HD.avi
2011-02-05 01:59:44.234 xxxxxxxxxx[62350:207]  parseScreen ->content f string Inception_HD.avi
2011-02-05 01:59:45.472 xxxxxxxxxx[62350:207]  parseScreen ->content dict does not have F string
2011-02-05 01:59:47.772 xxxxxxxxxx[62350:207]  parseScreen ->content dict does not contain Referenced file dictionary RF
(gdb) continue
2011-02-05 03:33:13.748 xxxxxxxxxx[62350:207]  parseScreen ->documentCatalog does not contain Names dict

My idea is to get the file stream, save it to a temporary file, and play it, as Adobe developers do with the Adobe SDK.

Thanks in advance

Comments (1)

娇纵 2024-10-23 00:27:44

I ended up packaging the media in separate files rather than inside the PDF. I couldn't get hold of the stream data in the PDF using Apple's Quartz libraries.
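
For anyone who still wants to try the stream route: the question's code finds the EF dictionary but never reads anything from it. In the PDF specification, a file specification's EF dictionary maps keys such as F to embedded file streams, so that is where the bytes would be expected to live. The following is a rough sketch under that assumption, not a confirmed fix. It has not been checked against the test PDF, and Quartz may still fail to expose the stream. `write_temp_file` and `dump_embedded_file` are hypothetical helper names, and the `pdfmedia-XXXXXX` template and the `TMPDIR` lookup are illustrative choices (on iOS, `NSTemporaryDirectory()` is probably the better source for the directory).

```c
/* mkstemp() needs POSIX declarations on non-Apple toolchains in strict ISO mode. */
#if !defined(__APPLE__) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes to a fresh temporary file.
   On success stores the file path in path_out and returns 0; returns -1 on failure. */
static int write_temp_file(const unsigned char *bytes, size_t len,
                           char *path_out, size_t path_cap)
{
    assert(path_out != NULL);

    const char *tmpdir = getenv("TMPDIR");   /* assumption: fall back to /tmp */
    if (tmpdir == NULL || tmpdir[0] == '\0') {
        tmpdir = "/tmp";
    }

    int n = snprintf(path_out, path_cap, "%s/pdfmedia-XXXXXX", tmpdir);
    if (n < 0 || (size_t)n >= path_cap) {
        return -1;                           /* path buffer too small */
    }

    int fd = mkstemp(path_out);              /* replaces XXXXXX with a unique suffix */
    if (fd < 0) {
        return -1;
    }

    size_t off = 0;
    while (off < len) {                      /* write() may store fewer bytes than asked */
        ssize_t w = write(fd, bytes + off, len - off);
        if (w <= 0) {
            close(fd);
            unlink(path_out);
            return -1;
        }
        off += (size_t)w;
    }
    return close(fd) == 0 ? 0 : -1;
}

#ifdef __APPLE__
#include <CoreGraphics/CoreGraphics.h>

/* Hypothetical helper: looks up EF -> F in a file specification dictionary
   and dumps the decoded stream data to a temporary file. */
static int dump_embedded_file(CGPDFDictionaryRef fileSpec,
                              char *path_out, size_t path_cap)
{
    CGPDFDictionaryRef ef = NULL;
    CGPDFStreamRef stream = NULL;

    if (!CGPDFDictionaryGetDictionary(fileSpec, "EF", &ef)) {
        return -1;                           /* no embedded-file dictionary */
    }
    if (!CGPDFDictionaryGetStream(ef, "F", &stream)) {
        return -1;                           /* EF has no F stream */
    }

    CGPDFDataFormat format;
    CFDataRef data = CGPDFStreamCopyData(stream, &format);
    if (data == NULL) {
        return -1;
    }

    int rc = write_temp_file(CFDataGetBytePtr(data),
                             (size_t)CFDataGetLength(data),
                             path_out, path_cap);
    CFRelease(data);                         /* "Copy" functions hand over ownership */
    return rc;
}
#endif
```

In the question's code this would be called with `contentDict` once the Filespec check has passed, for example `dump_embedded_file(contentDict, path, sizeof path)`. The temporary file has no extension, so a player that picks a decoder by extension may need the file renamed using the F or UF name.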
