Click here to monitor SSC
Av rating:
Total votes: 12
Total comments: 8


Phil Factor
COM Automation of Office Applications via PowerShell
26 May 2011

There need be no shame in using Office by automating it via COM. It was designed to be used that way, and with PowerShell, the various Office applications can be used as glorious output devices for data. Phil Factor uses some practical examples to try to persuade you to take the plunge.

There is something rather satisfying in using Office applications via COM automation in order to cut corners in development work. It can be a very useful way of providing functionality very quickly, and there is a lot of use locked within these applications. In this article, I’ll be giving a few illustrations of what can be achieved, with scripts that I find useful. I’ll use PowerShell, but you can get the same effect in a similar way in any .NET language, but with rather more effort. I’ve stripped each script down to the bare minimum so as to make it possible to embed them into this article. We'll end by automatically producing a database build script within a Word document and drawing an Excel graph from the raw data. We'll start a bit simpler, though.

Of course, we’ll use automation for silly reasons, such as getting a simple scripted way of generating speech

$Excel = New-Object -Com Excel.Application

$Excel.Visible = $false

$Excel.speech.speak('You have failed me, for the last time Admiral.')

$Excel.quit()

…or animating the awful animated paper-clip. (In the interests of public safety I won't show you how to do this, but it is great for livening up Powershell presentations. See Jeffrey Snover's code to do this, on page 412 0f Bruce Payette's book 'Powershell in Action'. )

For simple PowerShell scripting of Word and Excel, you’d probably prefer to use Out-MSWord by Jeffery Hicks (OUT-MSWord Revised ) for using MS Word as means of  output. and the equivalent IMPORT-EXCEL and EXPORT-EXCEL However, we’ll try to go beyond this simple sort of usage.

Scripting MS Word

Microsoft Word is an obvious choice for automation because it is a glorious means of providing output. I once designed a web- application that emailed beautiful PDF invoices using Word. (it read in an HTML file generated by the application). I had it running within hours of an accountant embarrassingly asking ‘how does it generate invoices and credit notes’ at a presentation. After making it robust and auditable, it remained for the life of the application.

 

We’ll start off with taking a webpage and saving this as a Word document. Actually, for this example we’ll save a whole list of articles from Simple-Talk and save them with all their formatting.

$DirectoryToSaveTo='s:\work\documents\Test\'

$Word = New-Object -Com Word.Application

$Word.Visible = $false #set this to true for debugging

if (!(Test-Path -path "$DirectoryToSaveTo"))#create it if not existing

  {

  New-Item "$DirectoryToSaveTo" -type directory | out-null

  }

$wdFormatXMLDocument=12 # http://msdn.microsoft.com/en-us/library/bb238158%28v=office.12%29.aspx

# you might want to save it in wdFormatDocument  (value=0) with older versions

foreach ($ArticleID in   @("1233","1291","1289","1290","1288","1287","1286"))

  {

  $Doc = $Word.Documents.Open("http://www.simple-talk.com/content/print.aspx?article=$ArticleID")

  $filename = $DirectoryToSaveTo+"Simple-Talk ("+$ArticleID+').docx' #save it according to its title

  if (test-path $filename) { rm $filename } #delete the file if it already exists

  $Doc.SaveAs([ref]$filename, [ref]$wdFormatXMLDocument)

  $Doc.Close() #close the document

  }

$Word.Quit() #and the instance of Word

Of course, you can save in a variety of formats, such as HTML, Text or PDF (if you have the necessary filter) Although this works pretty well, there is a quirk of Word that means that it sometimes makes a better job of the formatting and conversion if the document is pasted from Internet Explorer. OK. we can cope. By modifying the routine, It gives us some advantages because we can get more information about the file at the same time, such as the title of the document.

$DirectoryToSaveTo='s:\work\documents\Test\'

$Word = New-Object -Com Word.Application

$Exploder = New-Object -Com InternetExplorer.Application

$Word.Visible = $true

$Exploder.Visible=$true

if (!(Test-Path -path "$DirectoryToSaveTo"))#create it if not existing

  {

  New-Item "$DirectoryToSaveTo" -type directory | out-null

  }

foreach ($ArticleID in   @("1233","1291","1289","1290","1288","1287","1286"))

  {

  $Doc = $Word.Documents.Add() #create a new document

  $Exploder.Navigate("http://www.simple-talk.com/content/print.aspx?article=$ArticleID")

  while ($Exploder.Busy) {} #this should be done with a listener on the onload event

  #OK, so we can now declare some constants.

  $OLECMDID_SELECTALL=17 # see http://msdn.microsoft.com/en-us/library/ms691264%28v=vs.85%29.aspx

  $OLECMDID_COPY=12 # see http://msdn.microsoft.com/en-us/library/ms691264%28v=vs.85%29.aspx

  $wdFormatOriginalFormatting=16 #see http://msdn.microsoft.com/en-us/library/bb237976%28v=office.12%29.aspx

  $Exploder.ExecWB($OLECMDID_SELECTALL,0,$null,[ref]$null) #select all the page

  $Exploder.ExecWB($OLECMDID_COPY,0,$null,[ref]$null) #and copy the selection to the clipboard

        $filename=$Exploder.Document.Title

  $filename=$filename -replace  '[\\\/\:\.]',' ' #remove characters that can cause problems

  $Word.Selection.PasteAndFormat($wdFormatOriginalFormatting) # Preserves original formatting of the pasted material.

  $filename = $DirectoryToSaveTo+"$filename ("+$ArticleID+').docx' #save it according to its title

  if (test-path $filename ) { rm $filename } #delete the file if it already exists

  $Doc.SaveAs([ref]$filename)

  $Doc.Close()#close the document

  }

$Word.Quit() #and the instance of Word

So what have we done? We’ve automated two key Microsoft products, Internet Explorer, and Microsoft Word. I’ve just taken a manual process and scripted it

The script....

  • Checks whether  the directory where you want to save the Word files actually exists, and if not it then creates it.
  • Fires up internet Explorer
  • Fires up MS Word
  • navigates to the correct page
  • Finds the next article number of the article you want to save as a word file
  • Creates a new Word file
  • Works out he URL and navigates to the page using IE
  • Waits until the page loads
  • Select all
  • Copy the whole rendered page onto the clipboard
  • Pastes the rendered page into Microsoft Word
  • Works out the title of the file
  • Checks to see if it is already saved in the directory. If so it deletes it
  • Saves the file and closes it
  • Finds the next article to save
  • If nothing else to save, quits

Before you creating a script like this, it pays to be very clear about what the process consists of. If the process is using Word, Excel or Access, you can record a manual process as VBA (Visual Basic for Applications), and convert the VBScript files to Powershell. Other components such as IE take more fishing around in Books-On-Line to use.

Why do this elaborate way when MSWord's own file-conversion isn't too bad? The answer is that you may want to assemble parts of different websites, various images, or a collection of HTML fragments,  to create a document. Typically, it would be in making an integrated report from a number of places on an intranet. You can, instead of selecting the whole page, select just a part of it, and by selecting and pasting from various places you can concatenate a document. I use this technique to save tables, scripts and results that are already styled and formatted as XHTML fragments into a document. To do this in Word is tedious even if it makes less complicated word files, but who cares. It is quicker for me to do the formatting I want to data and just paste in each fragment to make up the word document. The recipient of the report never seems to complain. You can read fragments into IE or put the parts of the DOM you want onto the clipboard.

You can create HTML, text or PDF versions of your MSWord files very easily. Here, a word file is being saved as a rather voluminous HTML file. I use this technique to save all my word files in an intranet site so I can look through documents rapidly, and read them on a tablet..

$document='MyImportantFile.Docx'

$Word = New-Object -Com Word.Application

$Word.Visible = $false #set to 'true' for debugging!

if (!(test-path $document)) { "No such file as $document"

                                   break } # does it exist?

$existingDoc=$word.Documents.Open($document)

$saveaspath = $document.Replace('.docx','.HTML')

<# When you save a Microsoft Word document as HTML, Word displays the Web page similar to the way it will appear in a Web browser. Formatting and other items that are not supported by HTML or the Web page authoring environment are removed from the file #>

$wdFormatHTML = [ref] 8 #http://msdn.microsoft.com/en-us/library/bb238158%28v=office.12%29.aspx

if (test-path $saveaspath) { rm $saveaspath} #delete the output file if it already exists

$existingDoc.SaveAs( [ref] $saveaspath,$wdFormatHTML )

$existingDoc.Close()

$Word.Quit()

To turn to a more practical usage of  Word automation, here is a cut-down version of a build-script generator I use to save SQL Server database build scripts as MS Word files. As generating a build script is rather slow, it is best done on the back burner, by running it on the scheduler. I also have a version in SQL Scripts Manager, for doing ad-hoc build scripts.

$DirectoryToSaveTo='MyDirectory'

$servername='MyServer'

$database='MyDatabase'

if (!( Test-Path -path "$DirectoryToSaveTo" )) #create it if not existing

  { New-Item "$DirectoryToSaveTo" -type directory | out-null }

  

$filename="$servername $database" -replace  '[\\\/\:\.]',' ' #remove characters that can cause problems

$filename=$DirectoryToSaveTo+$filename

# Script all the objects in the specified database to a word document

# Load SMO assembly, and  SMOExtended

$v = [System.Reflection.Assembly]::LoadWithPartialName( 'Microsoft.SqlServer.SMO')

if ((($v.FullName.Split(','))[1].Split('='))[1].Split('.')[0] -ne '9') {

  [System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMOExtended') | out-null

}

$s = new-object ('Microsoft.SqlServer.Management.Smo.Server') $servername

$db = $s.databases[$database]

$dbname = $db.Name

$script  = new-object ('Microsoft.SqlServer.Management.Smo.Transfer') ($db)

$script.copyAllObjects = $true #we are going to script everything

$script.Options.ScriptBatchTerminator = $true

$script.copySchema=$true

$sc=$script.scriptTransfer()

"we now have the script. The time has come to write it to Word"

$Word = New-Object -Com Word.Application

$Word.Visible = $true

$Doc = $Word.Documents.Add()

$TitlePara = $Doc.Paragraphs.Add()

$TitlePara.Range.Style = "Heading 1"

$TitlePara.Range.Text = "/* Build Script for $dbname on $servername */"

$TitlePara.Range.InsertParagraphAfter()

foreach ($paragraph in $sc)  {

  $Para2 = $Doc.Paragraphs.Add()

  $Para2.Range.Text = $paragraph

  $Para2.Range.InsertParagraphAfter()

  }

$Word.Selection.WholeStory()

$Word.Selection.paragraphFormat.SpaceBefore = 0

$Word.Selection.paragraphFormat.SpaceAfter = 0

if (test-path $filename+'.docx' ) { rm $filename+'.docx'} #delete the output file if it already exists

if (test-path $filename+'.doc' ) { rm $filename+'.doc'} #even in the old format.

$Doc.SaveAs([ref]$filename)

$Doc.Close()

$Word.Quit()

"we have completed the task, master."

Just page 115 of 225! The build script in a Word document. Shouldn't it be in color?

You'll see that  the process is extraordinarily simple. As everything is formatted the same way, I do that at the end of the import, simply by highlighting everything and formatting everything at once; automatically, of course.

Automating Excel

For me, the easiest way of getting data into Excel is by pasting it in via Internet Explorer, or reading it in as an HTML table fragment.  It can be done by iterating through the cells via script, but it is a relatively slow way of doing it. If you have a dataSet or dataTable, you can convert it to an HTML table very easily in PowerShell. There are other ways such as writing to the ExcelXML format via the .NET SDK.  No, me neither, I'll stick to the wild way. Here is a script that illustrates the 'wild man' approach to pasting data into Excel. it is almost identical to the technique I used with Word.

$DirectoryToSaveTo='s:\work\spreadsheets\Test\'

$Filename='Quotations'

$Excel = New-Object -Com Excel.Application

$Exploder = New-Object -Com InternetExplorer.Application

$excel.Visible = $True

$Exploder.Visible=$True

if (!(Test-Path -path "$DirectoryToSaveTo"))#create it if not existing

  {

  New-Item "$DirectoryToSaveTo" -type directory | out-null

  }

$wb = $Excel.Workbooks.Add()

$ws = $wb.Worksheets.Item(1)

$Exploder.Navigate("http://www.simple-talk.com/blogbits/philf/quotations4.html")

while ($Exploder.Busy) {} #this should really be done with a listener on the onload event

#OK, so we can now declare some constants.

$OLECMDID_SELECTALL=17 # see http://msdn.microsoft.com/en-us/library/ms691264%28v=vs.85%29.aspx

$OLECMDID_COPY=12 # see http://msdn.microsoft.com/en-us/library/ms691264%28v=vs.85%29.aspx

$Exploder.ExecWB($OLECMDID_SELECTALL,0,$null,[ref]$null) #select all the page

$Exploder.ExecWB($OLECMDID_COPY,0,$null,[ref]$null) #and copy the selection to the clipboard

$xlbottom=[int]-4107 #http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.constants%28v=office.14%29.aspx

$xlRight=[int]-4152 #http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.constants%28v=office.14%29.aspx

$xlContext=[int]-5002

$xlOpenXMLWorkbook=[int]51 #http://msdn.microsoft.com/en-us/library/bb241279%28v=office.12%29.aspx

$ws.columns.item("A:A").ColumnWidth = 50

$ws.columns.item("B:B").ColumnWidth = 25

$excel.Range("A1").Select()

$excel.ActiveSheet.Paste()

$ws.columns.item("A:A").Select()

$excel.Selection.Font.Italic = $True

$wb = $Null #set all variables that point to Excel objects to null

$ws = $Null

$Excel=$Null

# hristo deshev's excel trick Pro Windows Powershell p380

[GC]::Collect()"

Our data (quotes in this case, is imported into Excel and formatted automatically

You'll have noticed that, just to show that we have been able to get in and format the columns that we've entered into Excel, we've changed the widths of the columns and made on italic, and the other one aligned bottom right. Excel has auto-detected the datatype of the values, and generally seems to get this right. (It sometimes needs a little help with dates)

Having proved that it is easy to get data into Excel automatically, we ought to do something more serious with it. In the next example, we'll read in the weather data for Cambridge (UK) from 1961 to 2010. Then we'll create two new computed columns, the first to hold the date, and the second to hold the rainfall in inches rather than the new-fangled centimeters.

Having done that, we'll produce a graph and add a few features, such as a trend-line (the good news is there is no significant change in the rainfall figures)

The objective of all this is to prove that one can generate graphical reports from a simple data set with just a small amount of automation, but with more versatility than one can get from reporting services.

$DirectoryToSaveTo='MyDirectory\'

$Filename='CambridgeRainfall'

$Excel = New-Object -Com Excel.Application

$excel.Visible = $True

if (!(Test-Path -path "$DirectoryToSaveTo"))#create it if not existing

  {

  New-Item "$DirectoryToSaveTo" -type directory | out-null

  }

$wb = $Excel.Workbooks.Open("http://www.simple-talk.com/blogbits/philf/rainfall.html")

$xlbottom=[int]-4107 #http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.constants%28v=office.14%29.aspx

$xlRight=[int]-4152 #http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.constants%28v=office.14%29.aspx

$xlContext=[int]-5002

$xlBarStacked = 58

$xlCategory = [int]1 #http://msdn.microsoft.com/en-us/library/ff198060.aspx

$xlValue=[int]2 #http://msdn.microsoft.com/en-us/library/ff198060.aspx

$xlPrimary=[int]1 #http://msdn.microsoft.com/en-us/library/ff196160.aspx

$xlColumnClustered   = [int]51

$msoScaleFromTopLeft=0 #from http://msdn.microsoft.com/en-us/library/aa432670

$msoScaleFromBottomRight= 2 #from http://msdn.microsoft.com/en-us/library/aa432670

$msoElementPrimaryCategoryAxisTitleAdjacentToAxis=[int]301 #http://msdn.microsoft.com/en-us/library/ff864118.aspx

$msoElementPrimaryCategoryAxisTitleRotated=[int]309 #http://msdn.microsoft.com/en-us/library/ff864118.aspx

$xlMovingAvg=6 #http://msdn.microsoft.com/en-us/library/ff192956.aspx

$xlOpenXMLWorkbook=[int]51 #http://msdn.microsoft.com/en-us/library/bb241279%28v=office.12%29.aspx

$ws=$Excel.ActiveSheet

#we now create two calculated columns from the date

#the first co calculate the excel date and the other to convert cm to inches

$excel.Range("H2").Select()

$excel.ActiveCell.FormulaR1C1 = 'Date'

$excel.Range("H3").Select()

$excel.ActiveCell.FormulaR1C1 = '=DATE(RC[-7],RC[-6],1)'

$excel.Range("I2").Select()

$excel.ActiveCell.FormulaR1C1 = "Rain (in)"

$excel.Range("I3").Select()

$excel.ActiveCell.FormulaR1C1 = '=RC[-3]*0.0393700787'

$excel.Range("H3:I596").Select()

$ws=$excel.Worksheets.add([System.Reflection.Missing]::Value,$ws)

$ch = $ws.shapes.addChart().chart

$ch.ChartType = $xlColumnClustered

$range=$excel.Range('Rainfall!$H$2:$I$596')

$ch.SetSourceData($range)

$ch.HasLegend = $False

$ch.ChartTitle.Text = "Rainfall in Cambridge 1961 - 2010"

$excel.ActiveChart.ChartArea.Select

$shape=$ws.Shapes.item("Chart 1")

$shape.ScaleWidth(2.4770833333, $False, $msoScaleFromBottomRight)

$shape.ScaleHeight( 2.1614581511,  $False, $msoScaleFromBottomRight)

$shape.ScaleWidth( 1.0218671152,  $False, $msoScaleFromTopLeft)

$shape.ScaleHeight( 1.3054874177,  $False, $msoScaleFromTopLeft)

$ch.Axes($xlCategory).MajorUnit = 12

$ch.Axes($xlCategory).TickLabels.NumberFormat = "yyyy;@"

$ch.ChartGroups(1).Overlap = -100

$ch.ChartGroups(1).GapWidth = 0

$ch.ChartGroups(1).varyByCategories = $False

$ch.SeriesCollection(1).Format.Fill.Transparency = 0.599999994

$trendline=$ch.SeriesCollection(1).Trendlines().Add($xlMovingAvg)

$trendline.Type = $xlMovingAvg

$trendline.Period = 12

$trendline.Format.Line.ForeColor.RGB = 255 #http://technet.microsoft.com/en-us/library/ee176944.aspx

$trendline.Format.Line.Weight = 1.75

$ch.SetElement($msoElementPrimaryCategoryAxisTitleAdjacentToAxis)

$ch.Axes($xlCategory, $xlPrimary).AxisTitle.Text = 'Date'

$filename=$filename -replace  '[\\\/\:\.]',' ' #remove characters that can cause problems

$filename = $DirectoryToSaveTo+$filename+'.xlsx' #save it according to its title

if (test-path $filename ) { rm $filename } #delete the file if it already exists

$wb.SaveAs($filename,  $xlOpenXMLWorkbook)

$wb.Saved = $True

$wb.Close() #close the document

$Excel.Quit() #and the instance of Excel

$wb = $Null #set all variables that point to Excel objects to null

$ws = $Null

$Excel=$Null

# Hristo Deshev's Excel trick 'Pro Windows Powershell' p380

[GC]::Collect()

Here is the imported data and the two added columns that calculate the date and the rainfall in inches

The automatically-generated Excel graph of rainfall. Click on it to see it full-size.

A few traps for the unwary.

Most automation examples are in VBS. Not many have been converted to PowerShell yet. None of the Office applications will record your actions as PowerShell scripts yet, which is odd, considering Microsoft's keen adoption of PowerShell: You are stuck with VBA. There are one or two things that can cause puzzlement.

 COM collections are indexed through the Item parameterised-property. Visual Basic uses that Item property as the default if you don't specify it. PowerShell doesn't, because the general .NET interop method isn't perfect, so occasionally you have to add it. PowerShell uses brackets instead of Visual Basic's parentheses for indexing collections. COM collections follow the Microsoft standard of indexing items in a collection from 1 to n whereas  .NET and PowerShell collections use the index 0- n-1  for that. Another problem you'll hit is that methods in VBA don't require the empty parentheses, whereas in PowerShell, they do. You'll spot this problem if you start getting Powershell listing the properties of the object rather than doing the method!

More awkward than this is the problem that the recorded scripts have to follow the logic of the way you approach the GUI. Usually, you work by selecting the object you want to work with and then manipulate its attributes via a series of dialog boxes. This means that scripts tend to work on the same approach even when it isn't the quickest and most efficient way of automating it. I find it is better to get something running fairly quickly and then rework the code to walk the object model rather than work on the selected object.

Sometimes, the documentation you find is pretty rudimentary. Microsoft technical authors for  Office Automation are notoriously tight-lipped. Fortunately, Powershell is very good at telling you what is available at any point. An easy way to get to grips with a COM object is to get its members. This is a way of doing it. In this example, we are getting the members exposed for COM automation by Internet Explorer.

$Exploder = New-Object -Com InternetExplorer.Application

$exploder | Get-Member| ConvertTo-HTML

$exploder.quit()

...Which gives the result...

  Name MemberType Definition
  ClientToWindow Method void ClientToWindow (int, int)
  ExecWB Method void ExecWB (OLECMDID, OLECMDEXECOPT, Variant, Variant)
  GetProperty Method Variant GetProperty (string)
  GoBack Method void GoBack ()
  GoForward Method void GoForward ()
  GoHome Method void GoHome ()
  GoSearch Method void GoSearch ()
  Navigate Method void Navigate (string, Variant, Variant, Variant, Variant)
  Navigate2 Method void Navigate2 (Variant, Variant, Variant, Variant, Variant)
  PutProperty Method void PutProperty (string, Variant)
  QueryStatusWB Method OLECMDF QueryStatusWB (OLECMDID)
  Quit Method void Quit ()
  Refresh Method void Refresh ()
  Refresh2 Method void Refresh2 (Variant)
  ShowBrowserBar Method void ShowBrowserBar (Variant, Variant, Variant)
  Stop Method void Stop ()
  AddressBar Property bool AddressBar () {get} {set}
  Application Property IDispatch Application () {get}
  Busy Property bool Busy () {get}
  Container Property IDispatch Container () {get}
  Document Property IDispatch Document () {get}
  FullName Property string FullName () {get}
  FullScreen Property bool FullScreen () {get} {set}
  Height Property int Height () {get} {set}
  HWND Property int HWND () {get}
  Left Property int Left () {get} {set}
  LocationName Property string LocationName () {get}
  LocationURL Property string LocationURL () {get}
  MenuBar Property bool MenuBar () {get} {set}
  Name Property string Name () {get}
  Offline Property bool Offline () {get} {set}
  Parent Property IDispatch Parent () {get}
  Path Property string Path () {get}
  ReadyState Property tagREADYSTATE ReadyState () {get}
  RegisterAsBrowser Property bool RegisterAsBrowser () {get} {set}
  RegisterAsDropTarget Property bool RegisterAsDropTarget () {get} {set}
  Resizable Property bool Resizable () {get} {set}
  Silent Property bool Silent () {get} {set}
  StatusBar Property bool StatusBar () {get} {set}
  StatusText Property string StatusText () {get} {set}
  TheaterMode Property bool TheaterMode () {get} {set}
  ToolBar Property int ToolBar () {get} {set}
  Top Property int Top () {get} {set}
  TopLevelContainer Property bool TopLevelContainer () {get}
  Type Property string Type () {get}
  Visible Property bool Visible () {get} {set}
  Width Property int Width () {get} {set}

you may get a "Old format or invalid type library" error when automating Excel. the problem is in your Windows regional settings. If the client computer runs the English version of Excel and the locale for the current user is configured for a language other than English, Excel will try to locate the language pack for the configured language. If the language pack is not found, the error is reported. To fix this problem, it is best to to install the multilingual user interface pack. (See Microsoft Knowledge Base article http://support.microsoft.com/default.aspx?scid=kb;en-us;320369 )

Further reading.

For COM automation of office applications, the internet is a frustrating source of wisdom and insight. There are, however, some very good Powershell books by people who know PowerShell very well, with chapters on COM automation.

  • Pro Windows PowerShell  by Hristo Deshev
  • Windows PowerShell in action by Bruce Payette
  • Windows PowerShell Cookbook by Lee Holmes

Because it is reasonably searchable, the Technet script repository is a good source of ideas, though the quality of script will vary!  There are some other archives with very good COM scripts to Office applications in them, but this is the first port of call.



This article has been viewed 7788 times.
Phil Factor

Author profile: Phil Factor

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 25 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career. See also :

To translate this article...

Search for other articles by Phil Factor

Rate this article:   Avg rating: from a total of 12 votes.


Poor

OK

Good

Great

Must read
 
Have Your Say
Do you have an opinion on this article? Then add your comment below:
You must be logged in to post to this forum

Click here to log in.


Subject: Nice Piece of Work
Posted by: MVV (view profile)
Posted on: Monday, May 30, 2011 at 7:13 AM
Message: Very didactic , how you can do so much with so little.

I'm trying to expand on your example by saving the simple-talk articles directly in PDF format (office 2007) , but i fail

I tried
$Doc.ExportAsFixedFormat($nombreArchivo, $paramExportFormat , $paramOpenAfterExport, $paramExportOptimizeFor, $paramExportRange, $paramStartPage, $paramEndPage, $paramExportItem, $paramIncludeDocProps, $paramKeepIRM, $paramCreateBookmarks, $paramDocStructureTags, $paramBitmapMissingFonts, $paramUseISO19005_1)

and

$Doc.SaveAs( [ref] $nombreArchivo, $paramExportFormat )
with $paramExportFormat value of 17 as listed on http://msdn.microsoft.com/en-us/library/bb238158%28v=office.12%29.aspx

but there is no way to automate the saving in fixed format (ie , PDF)
Any ideas?

Subject: Re: Saving in PDF
Posted by: Phil Factor (view profile)
Posted on: Monday, May 30, 2011 at 1:34 PM
Message: Yes. It is certainly possible. Have you got the PDF filters installed? In Office 2007, you can download and install a 'Save As PDF' add-in allowing you to save directly to PDF.
Download details: 2007 Microsoft Office Add-in: Microsoft Save as PDF http://bit.ly/kskUNS
There are plenty of examples of using it on the Web.

Subject: Saving in PDF
Posted by: MVV (view profile)
Posted on: Tuesday, May 31, 2011 at 4:06 AM
Message: Thank you for your answer.
In fact i had already installed the addon to save the fixed formats pdf and xfs. I think the problem may be more related to using WindowsXP and PowerShell v1.
I'll give it another go later.

Subject: Intriguing
Posted by: Anonymous (not signed in)
Posted on: Tuesday, May 31, 2011 at 7:39 AM
Message: I have recently started dabbling in Power Shell myself.

You bring a refreshing perspective on what is possible.

Subject: Compute By results into Excel
Posted by: Bill Ross (view profile)
Posted on: Tuesday, May 31, 2011 at 9:31 AM
Message: I've just discovered the joys of the COMPUTE BY clause of a SELECT statement. Is there a way to have the results routed conveniently to Excel, all on one tab of one spreadsheet?

Many thanks...

Subject: Re: Compute by
Posted by: Phil Factor (view profile)
Posted on: Tuesday, May 31, 2011 at 9:55 AM
Message: The COMPUTE BY clause is rather cute but designed for teletype printout and likely to eventually be deprecated, but you'd probably never need to route the results to Excel unless you particularly wanted to since Excel has it's own rather better way of doing the same thing.
you can, of couse, send the row data to excel and add the compute and sum fields. You can calculate the sums and totals within the same result, using ROLLUP and send them to Excel (If you just wanted the summaries, have a look at what we do in the http://www.simple-talk.com/sql/t-sql-programming/crosstab-pivot-table-workbench/ . You can use the technique I show here to do some pretty good-looking charts)
For the most flexibility,you can send the data to Excel and then do a PivotTable rotation on it in a script. The quickest way of scripting this is to set a macro to record, and record the manual process. Then turn the VBA script into Powershell and COM. It is pretty quick and painless, though in my experience, there is always some fiddly thing that refuses to work.

Subject: PDF Export
Posted by: MVV (view profile)
Posted on: Tuesday, May 31, 2011 at 12:16 PM
Message: ok , i got it working with
$Doc.ExportAsFixedFormat($filename,$p_ExportFormat , $p_OpenAfterExport, $p_ExportOptimizeFor, $p_ExportRange, $p_StartPage, $p_EndPage, $p_ExportItem, $p_IncludeDocProps, $p_KeepIRM, $p_CreateBookmarks, $p_DocStructureTags, $p_BitmapMissingFonts, $p_UseISO19005_1)
$Doc.Saved = $True #Dont nag me about saving changes
$Doc.Close()#close the document

using for parameters this
$p_ExportFilePath = ""
$p_ExportFormat = 17 #WdExportFormat.wdExportFormatPDF, ver http://msdn.microsoft.com/en-us/library/bb243311.aspx
$p_OpenAfterExport = 0
$p_ExportOptimizeFor = 0 #WdExportOptimizeFor.wdExportOptimizeForPrint , ver http://msdn.microsoft.com/en-us/library/bb243313.aspx
$p_ExportRange = 0 # WdExportRange.wdExportAllDocument , ver http://msdn.microsoft.com/en-us/library/bb243314.aspx
$p_StartPage = 1
$p_EndPage = 1
$p_ExportItem =0 #WdExportItem.wdExportDocumentContent , ver http://msdn.microsoft.com/en-us/library/bb243312.aspx
$p_IncludeDocProps = 1
$p_KeepIRM = 1
$p_CreateBookmarks =0 #2 # WdExportCreateBookmarks.wdExportCreateWordBookmarks , ver http://msdn.microsoft.com/en-us/library/bb243310.aspx
$p_DocStructureTags = 1
$p_BitmapMissingFonts = 1
$p_UseISO19005_1 = 0

And it works just fine. No nags about saving the exported document and good quality of conversion. Just to ice the cake , i'll see how to set each export on a different thread, if possible.
Thanks a lot for your article ;)

Subject: Conversion of Word documents
Posted by: zencop (view profile)
Posted on: Tuesday, January 03, 2012 at 8:39 AM
Message: Good morning,

I came across your sp to convert Word documents to text and unfortunately am getting several errors (range.text) and it could be there version of Microsoft Office that I have installed, which is 2000. However, I did a search of your profile and looked at what other things that you wrote and then I found this.

I looked at the piece in regards to converting Word documents to HTML and I am pretty new at Powershell (I will be picking up a book today if not tomorrow). I am just wondering whether the code you supplied require the latest version of office/word in order to work.

Essentially, I am building a document search system using SQL Server and the FullTextSearch. Whilst I discovered both how to do it with this or Indexing Server, the reality is that it won't extract and show the paragraph with the relative search terms in it, so I need to convert the documents to straight text and also store them as well.

Thanks for anything that anyone might be able to share.

 






recommended site pinvoke

PInvoke.net is a user-driven wiki which provides .NET developers with native method signatures, so they don't have to spend time writing them from scratch.




Top rated articles
C# Async: What is it, and how does it work?
 The biggest new feature in C#5 is Async, and its associated Await (contextual) keyword. Anybody who is... Read more...

Towards the Perfect Build
 An automated build and deployment system is no longer a dream. Now that PowerShell has matured as a... Read more...

Practical PowerShell: Pruning File Trees and Extending Cmdlets
 One of the most radical features of PowerShell is amongst the least known. It is possible to extend... Read more...

TortoiseSVN and Subversion Cookbook Part 4: Sharing Common Code
 Michael Sorens continues his series on source control with Subversion and TortoiseSVN by describing... Read more...

Feature Usage Reporting in Early Access Programs
 After doing Web development, you can get very used to the luxury of having basic information about your... Read more...

Most viewed articles
A Complete URL Rewriting Solution for ASP.NET 2.0
 Ever wondered whether it's possible to create neater URLS, free of bulky Query String parameters?... Read more...

Visual Studio Setup - projects and custom actions
 This article describes the kinds of custom actions that can be used in your Visual Studio setup project. Read more...

.NET Application Architecture: the Data Access Layer
 Find out how to design a robust data access layer for your .NET applications. Read more...

Web Parts in ASP.NET 2.0
 Most Web Parts implementations allow users to create a single portal page where they can personalize... Read more...

Calling Cross Domain Web Services in AJAX
 The latest craze for mashups involves making cross-domain calls to Web Services from APIs made publicly... Read more...

Over 400,000 Microsoft professionals subscribe to the Simple-Talk technical journal. Join today, it's fast, simple, free and secure.

Join Simple Talk