Writing XML files for a KEGGREST database


Using the KEGGREST library, I'm trying to write 134 XML files, where each file represents a unique pathway. I first used a simple loop to write each file and then tried lapply, but both approaches write (or rewrite) everything into a single file instead of producing 134 files in my working directory, and in the end that file contains only the first pathway in the list (size 24 KB).
The code below is reproducible:
library(KEGGREST)
pathways <- names(keggList("pathway", "mdm"))
To get the 'KGML' (XML) file simply use:
keggGet("pathway", "kgml")
where "pathway" takes a named pathway like "path:mdm04146", and outputs the kgml/xml file. When writing each file, a message is printed: No encoding supplied: defaulting to UTF-8. In the reference manual, there is no argument to change the encoding. However, I assume this is defaulting to a txt file format, and I know I can coerce it to the xml format.
To iterate over the 134 pathways, get each corresponding kgml pathway file, and save it as an xml file, I initially tried the first three pathways as a test, but to no avail:
for (i in pathways[1:3]) {
  write.table(keggGet(i, "kgml"), file = paste(i, ".xml", sep = ""),
              col.names=FALSE, row.names=FALSE, sep="\t", quote=FALSE)
}
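A possible explanation for the loop above, offered as an assumption rather than a confirmed diagnosis: the IDs contain a colon ("path:mdm00010"), and on some file systems (Windows in particular) a colon is not a valid character in a file name, so every iteration may end up writing to the same oddly named file. The sketch below strips the "path:" prefix and writes the KGML string as plain text with writeLines. The names `save_kgml`, `fetch`, and `dir` are hypothetical and exist only for this example.

```r
# Sketch: write each KGML string to its own file, one file per pathway.
# `save_kgml`, `fetch`, and `dir` are hypothetical names for illustration.
save_kgml <- function(ids,
                      fetch = function(id) KEGGREST::keggGet(id, "kgml"),
                      dir = ".") {
  vapply(ids, function(id) {
    # Drop the "path:" prefix so the file name contains no colon.
    out <- file.path(dir, paste0(sub("^path:", "", id), ".xml"))
    # The console output in the question shows the KGML arriving as a
    # character string, so it can be written as plain text.
    writeLines(fetch(id), out)
    out
  }, character(1), USE.NAMES = FALSE)
}

# Usage (needs network access):
# library(KEGGREST)
# pathways <- names(keggList("pathway", "mdm"))
# save_kgml(pathways[1:3])
```

Passing the download step in as `fetch` keeps the file-writing logic checkable without a network connection.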
Then I thought to first read it as a large text file, then write the xml file:
out.file <- ""
for (i in pathways[1:3]) {
  file <- keggGet(i, "kgml")
  out.file <- rbind(out.file, file)
}
write.table(out.file, file = paste(pathways[i], ".xml", sep = ""), sep = "\t")
I know I can get the kgml file because it can open in R:
> keggGet("path:mdm04933", "kgml")
No encoding supplied: defaulting to UTF-8.
[1] "<?xml version=\"1.0\"?>\n<!DOCTYPE pathway SYSTEM
\"http://www.kegg.jp/kegg/xml/KGML_v0.7.2_.dtd\">\n<!-- Creation date: Nov
26, 2015 11:26:57 +0900 (GMT+9) -->\n<pathway name=\"path:mdm04933\"
org=\"mdm\" number=\"04933\"\n title=\"AGE-RAGE signaling pathway in
diabetic complications\"\n
image=\"http://www.kegg.jp/kegg/pathway/mdm/mdm04933.png\"\n
link=\"http://www.kegg.jp/kegg-bin/show_pathway?mdm04933\">\n <entry
id=\"1\" name=\"path:mdm04933\" type=\"map\"\n
link=\"http://www.kegg.jp/dbget-bin/www_bget?mdm04933\">\n <graphics
name=\"TITLE:AGE-RAGE signaling pathway in diabetic complications\"
fgcolor=\"#000000\" bgcolor=\"#FFFFFF\"\n ......................until the
end of the file.
I tested parts of my loop to make sure they work:
> for (i in pathways[1:3]){
+ print(i)
+ }
[1] "path:mdm00010"
[1] "path:mdm00020"
[1] "path:mdm00030"
> for(i in pathways[1:3]){
+ print(paste(pathways[i], ".xml", sep = ""))
+ }
[1] "path:mdm00010.xml"
[1] "path:mdm00020.xml"
[1] "path:mdm00030.xml"
Although I get only a single XML file, I can see the loop iterate and process all 134 pathways into that one file. After processing, however, it is 24 KB in size, so only the first pathway (mdm00010) was saved, and interestingly, the file is not a named file. I can watch its size in KB go up and down as it is being 'rewritten'...
Here is the lapply call I used for the same operation:
lapply(pathways[1:3],
       function(i, pathways) write.table(keggGet(i, "kgml"),
                                         paste(pathways[i], ".xml", sep = ""),
                                         col.names=FALSE, row.names=FALSE, sep="\t", quote=FALSE),
       pathways)
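One thing that may be worth checking in the call above: `i` is already a pathway ID (a string), so `pathways[i]` indexes the vector by name. If `pathways` is an unnamed character vector, which is what names() normally returns, that lookup yields NA, and every iteration would then target the same file, "NA.xml". A small self-contained illustration, using a made-up vector `ids` in place of `pathways`:

```r
# `ids` stands in for `pathways`; it is assumed to be unnamed.
ids <- c("path:mdm00010", "path:mdm00020", "path:mdm00030")

i <- ids[1]                 # what the loop variable holds: the ID itself
looked_up <- ids[i]         # character index on an unnamed vector gives NA

file_from_lookup <- paste0(looked_up, ".xml")  # built from the NA lookup
file_from_value  <- paste0(i, ".xml")          # built from the ID directly

print(file_from_lookup)
print(file_from_value)
```

Using the loop variable directly, or iterating over positions with seq_along(), avoids the name lookup.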
Lastly, I can save one xml file by using the XML package:
h <- xmlInternalTreeParse(keggGet("path:mdm04933", "kgml"))
saveXML(h, "try.xml")
However, when I try to loop it, it completes with no errors but no files are written:
for (i in pathways[1:3]) {
  XML::saveXML(XML::xmlInternalTreeParse(keggGet(i, "kgml"), paste(i, ".xml", sep = "")))
}
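In the loop above, the file name appears to sit inside the call to xmlInternalTreeParse() rather than saveXML(), and saveXML() without a file argument returns the XML as a string instead of writing it, which would be consistent with no files appearing. Below is a sketch with the arguments regrouped, assuming the XML package is installed. The names `write_kgml_xml`, `fetch`, and `dir` are hypothetical, and the "path:" prefix is removed on the assumption that a colon in a file name is a problem on the target file system.

```r
library(XML)

# Sketch: parse each KGML string and save it as its own XML file.
# `write_kgml_xml`, `fetch`, and `dir` are hypothetical names.
write_kgml_xml <- function(ids,
                           fetch = function(id) KEGGREST::keggGet(id, "kgml"),
                           dir = ".") {
  for (id in ids) {
    # asText = TRUE marks the input as XML content, not a file name.
    doc <- xmlInternalTreeParse(fetch(id), asText = TRUE)
    # The output file is an argument of saveXML(), not of the parser.
    out <- file.path(dir, paste0(sub("^path:", "", id), ".xml"))
    saveXML(doc, file = out)
  }
  invisible(NULL)
}

# Usage (needs network access and KEGGREST):
# write_kgml_xml(pathways[1:3])
```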
Thanks for reading! Any help in understanding what's going on here is greatly appreciated.
